Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature: Import related tags to articles (Categories) #1248

Merged
merged 14 commits into from
Mar 17, 2021

Conversation

LinkJim
Copy link
Contributor

@LinkJim LinkJim commented Mar 13, 2021

Authors : Jimmy HUYNH (@LinkJim) - Marco NASSABAIN (@mnassabain)

🚀 Feature

Import related tags for each article

  • Could be useful to sort articles by tags (in future...)
  • Could be integrated in front-end to show keywords
  • Could be used to generate hashtags for sharing on social media (e.g.: Twitter)
  • UX / UI Friendly !

We are already working on sharing articles with users and on social media. (pr existants)

In case of sharing articles on social medias, we thought that sharing posts, would be more relevant if hashtags were automatically generated. Being that generating tags based on a text is a complex problem, we would use the tags retrieved from the RSS feeds instead.

This is an example of usage that the newly added "tags" field would offer. But we think that there are surely more advantages.

💻 Implementation

  • We added a migration that inserts a "tags" column for news_items.
  • We updated the Item model to include the "tags" field.
  • In the FeedFetcher, we simply import the tags using the "categories" tag from the RSS feed.

💡 Questions

  • In our current implementation the "tags" column is a "text" field. We thought that it is more appropriate than a string with a defined length?
  • This current implementation only adds tags for newly imported articles. Already existing articles in the database will not be updated.

✅ To do

  • Add tests
  • Update already existing items to fetch their tags? Would this be possible?

Any recommandation would be appreciated, thanks. ;-)

Jimmy Huynh and others added 4 commits March 13, 2021 00:37
Signed-off-by: Jimmy Huynh <jimmy.huynh@etu.unistra.fr>
Signed-off-by: Jimmy Huynh <jimmy.huynh@etu.unistra.fr>
Signed-off-by: Jimmy Huynh <jimmy.huynh@etu.unistra.fr>
Signed-off-by: Marco Nassabain <marco.nassabain@hotmail.com>
Copy link
Contributor

@SMillerDev SMillerDev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code looks fine to me. But it will need tests.

@anoymouserver
Copy link
Contributor

✅ To do

  • Update already existing items to fetch their tags? Would this be possible?

If an article has been changed at the actual feed provider, it is automatically updated (old items, which no longer exist in the feed, are not updated for obvious reasons). But since this is 'only' a transitional problem, I honestly wouldn't do anything to force-refetch it.


💡 Questions

  • In our current implementation the "tags" column is a "text" field. We thought that it is more appropriate than a string with a defined length?

Using text seems reasonable, as it's unknown how long the content will be.


I'm not sure if 'tag' is the best name for it, since I personally associate it with a tagging feature for the user for sorting and filtering (NC does the same with 'Collaborative tags').
Maybe using the feed's naming of 'category' would be clearer.

The fetched categories should then also be added into generateSearchIndex.

Jimmy Huynh and others added 3 commits March 14, 2021 23:43
Signed-off-by: Jimmy Huynh <jimmy.huynh@etu.unistra.fr>
Signed-off-by: Jimmy Huynh <jimmy.huynh@etu.unistra.fr>
Signed-off-by: Marco Nassabain <marco.nassabain@hotmail.com>
@mnassabain
Copy link
Contributor

mnassabain commented Mar 15, 2021

To do

  • Rename "tags" to "categories"
  • Return categories in jsonSerialize (we think this is right?)
  • jsonSerialize should return categories as an array
  • Add categories to generateSearchIndex
  • Add tests
  • Use json to implement categories
  • Update changelog

mnassabain and others added 5 commits March 16, 2021 22:03
+ added setter/getters that work with arrays to simplify use case

Signed-off-by: Marco Nassabain <marco.nassabain@hotmail.com>
Signed-off-by: Marco Nassabain <marco.nassabain@hotmail.com>
Signed-off-by: Marco Nassabain <marco.nassabain@hotmail.com>
Signed-off-by: Jimmy Huynh <linkatox@gmail.com>
Signed-off-by: Jimmy Huynh <linkatox@gmail.com>
@LinkJim
Copy link
Contributor Author

LinkJim commented Mar 16, 2021

Thanks for your quick replies and support.

Features and related tests have been successfully implemented according to your guidance.
One thing we're not sure about for the search index. We simply dump the json of the categories into it.. Does it seem correct or is there an eventual convention applied ?

Check it out and let us know about what you think !

@SMillerDev
Copy link
Contributor

The searchindex is a concatenated string of all search terms. So adding something like join('', $this->categories()) would be better.

Signed-off-by: Marco Nassabain <marco.nassabain@hotmail.com>
Signed-off-by: Marco Nassabain <marco.nassabain@hotmail.com>
Copy link
Contributor

@SMillerDev SMillerDev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @LinkJim!

@SMillerDev SMillerDev merged commit 5ab065b into nextcloud:master Mar 17, 2021
@anoymouserver
Copy link
Contributor

I just wanted to try it on my test instance but found no feed with categories included. Any suggestions?

@mnassabain
Copy link
Contributor

Since it is an optional element (specification), not all feeds have it.

The easiest example of a feed that contains categories is the Nextcloud News feed. You can find it in the explore page, or you can subscribe manually to it via this url: https://nextcloud.com/blogfeed

@anoymouserver
Copy link
Contributor

Yes I know .. I've tried multiple news sites and blogs, but found none with a category tag.
But I didn't even think about the NC feed .. thanks 👍

Grotax added a commit that referenced this pull request Apr 3, 2021
Changed
- Add BATS as integration tests (#1213)
- Update FeedFetcher to import categories from feeds (#1248)
- Update serialization of item to include categories (#1248)
- Make PHPStan stricter (#955)
- Search: Add folder search (#1215)
- Improve test coverage (#1263)
- Allow directly adding a feed without going through the discovery process (#1265)

Fixed
- Do not show deleted feeds in item list (#1214)
- Fix update queries (#1211)
@Grotax Grotax mentioned this pull request Apr 3, 2021
Grotax added a commit that referenced this pull request Apr 3, 2021
Changed
- Add BATS as integration tests (#1213)
- Update FeedFetcher to import categories from feeds (#1248)
- Update serialization of item to include categories (#1248)
- Make PHPStan stricter (#955)
- Search: Add folder search (#1215)
- Improve test coverage (#1263)
- Allow directly adding a feed without going through the discovery process (#1265)

Fixed
- Do not show deleted feeds in item list (#1214)
- Fix update queries (#1211)

Signed-off-by: Benjamin Brahmer <info@b-brahmer.de>
Grotax added a commit that referenced this pull request Apr 3, 2021
Changed
- Add BATS as integration tests (#1213)
- Update FeedFetcher to import categories from feeds (#1248)
- Update serialization of item to include categories (#1248)
- Make PHPStan stricter (#955)
- Search: Add folder search (#1215)
- Improve test coverage (#1263)
- Allow directly adding a feed without going through the discovery process (#1265)

Fixed
- Do not show deleted feeds in item list (#1214)
- Fix update queries (#1211)

Signed-off-by: Benjamin Brahmer <info@b-brahmer.de>
Neo11 pushed a commit to Neo11/news that referenced this pull request May 28, 2022
Changed
- Add BATS as integration tests (nextcloud#1213)
- Update FeedFetcher to import categories from feeds (nextcloud#1248)
- Update serialization of item to include categories (nextcloud#1248)
- Make PHPStan stricter (nextcloud#955)
- Search: Add folder search (nextcloud#1215)
- Improve test coverage (nextcloud#1263)
- Allow directly adding a feed without going through the discovery process (nextcloud#1265)

Fixed
- Do not show deleted feeds in item list (nextcloud#1214)
- Fix update queries (nextcloud#1211)

Signed-off-by: Benjamin Brahmer <info@b-brahmer.de>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants