Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Taggings Unique Index performance issue #509

Closed
jnimety opened this issue Apr 9, 2014 · 19 comments
Closed

Taggings Unique Index performance issue #509

jnimety opened this issue Apr 9, 2014 · 19 comments
Labels
Milestone

Comments

@jnimety
Copy link

jnimety commented Apr 9, 2014

We ran into a performance issue after running the taggings unique index migration. We're using mysql and after the migration the query plan for #tag_list is performing a full table scan on the tags table.

With the unique index, full table scan of tags required:

mysql> show create table taggings\G
*************************** 1. row ***************************
       Table: taggings
Create Table: CREATE TABLE "taggings" (
  "id" int(11) NOT NULL AUTO_INCREMENT,
  "tag_id" int(11) DEFAULT NULL,
  "taggable_id" int(11) DEFAULT NULL,
  "taggable_type" varchar(255) DEFAULT NULL,
  "tagger_id" int(11) DEFAULT NULL,
  "tagger_type" varchar(255) DEFAULT NULL,
  "context" varchar(255) DEFAULT NULL,
  "created_at" datetime DEFAULT NULL,
  PRIMARY KEY ("id"),
  UNIQUE KEY "taggings_idx" ("tag_id","taggable_id","taggable_type","context","tagger_id","tagger_type")
)
1 row in set (0.01 sec)

mysql> explain SELECT `tags`.* FROM `tags` INNER JOIN `taggings` ON `tags`.`id` = `taggings`.`tag_id` WHERE `taggings`.`taggable_id` = 195395  AND `taggings`.`taggable_type` = 'TskTask' AND (taggings.context = 'tags' AND taggings.tagger_id IS NULL)\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: tags
         type: ALL
possible_keys: PRIMARY
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 12811
        Extra:
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: taggings
         type: ref
possible_keys: taggings_idx
          key: taggings_idx
      key_len: 1551
          ref: control_jnimety.tags.id,const,const,const,const
         rows: 1
        Extra: Using where; Using index
2 rows in set (0.00 sec)

Without the unique index:

mysql> show create table taggings\G
*************************** 1. row ***************************
       Table: taggings
Create Table: CREATE TABLE "taggings" (
  "id" int(11) NOT NULL AUTO_INCREMENT,
  "tag_id" int(11) DEFAULT NULL,
  "taggable_id" int(11) DEFAULT NULL,
  "taggable_type" varchar(255) DEFAULT NULL,
  "tagger_id" int(11) DEFAULT NULL,
  "tagger_type" varchar(255) DEFAULT NULL,
  "context" varchar(255) DEFAULT NULL,
  "created_at" datetime DEFAULT NULL,
  PRIMARY KEY ("id"),
  KEY "index_taggings_on_tag_id" ("tag_id"),
  KEY "index_taggings_on_taggable_id_and_taggable_type_and_context" ("taggable_id","taggable_type","context")
)
1 row in set (0.00 sec)

mysql> explain SELECT `tags`.* FROM `tags` INNER JOIN `taggings` ON `tags`.`id` = `taggings`.`tag_id` WHERE `taggings`.`taggable_id` = 195395  AND `taggings`.`taggable_type` = 'TskTask' AND (taggings.context = 'tags' AND taggings.tagger_id IS NULL)\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: taggings
         type: ref
possible_keys: index_taggings_on_tag_id,index_taggings_on_taggable_id_and_taggable_type_and_context
          key: index_taggings_on_taggable_id_and_taggable_type_and_context
      key_len: 1541
          ref: const,const,const
         rows: 1
        Extra: Using where
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: tags
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: control_jnimety.taggings.tag_id
         rows: 1
        Extra:
2 rows in set (0.00 sec)
@bf4
Copy link
Collaborator

bf4 commented Apr 13, 2014

I'll have to look at this and think a bit. If you think you can help, please do. This repo needs it!

@jnimety
Copy link
Author

jnimety commented Apr 14, 2014

@jhirbour (mysql guru) and I took a more detailed look at this and couldn't figure out why the tags index is not getting used. We tried a few things to force the index but no joy. @jhirbour, if you have anything to add please do. Otherwise this is far beyond my mysql skill set and rolling back to the previous indexes fixed the perf issue.

@jhirbour
Copy link

Hey I just got back from vacation. I'll look at this during this week.

After some initial testing over IRC we determined that at first glance this isn't an issue with MySQL picking the right index (troubleshot by index hinting).

@jnimety can you tell me how many rows you have in: tags and in taggings just so I'm testing at scale.

@bf4
Copy link
Collaborator

bf4 commented Apr 27, 2014

Could this be related to #516

@jnimety
Copy link
Author

jnimety commented Apr 28, 2014

@jhirbour

2.1.1 :006 > ActsAsTaggableOn::Tag.count
D, [2014-04-28T14:39:55.160609 #3627] DEBUG -- :    (0.3ms)  SELECT COUNT(*) FROM `tags`
 => 13459
2.1.1 :007 > ActsAsTaggableOn::Tagging.count
D, [2014-04-28T14:39:57.257195 #3627] DEBUG -- :    (0.3ms)  SELECT COUNT(*) FROM `taggings`
 => 865510

@jnimety
Copy link
Author

jnimety commented Apr 28, 2014

@bf4 great lead. I bet that's it, in our situation tagger_id and tagger_type are all NULL.

@avitus
Copy link

avitus commented Apr 28, 2014

We're seeing that same performance issue (page load has gone from 0.8 sec to 28 sec) for #tag_list and we have no tagger_id or tagger_type values that are NULL.

irb(main):036:0> ActsAsTaggableOn::Tag.count
=> 11182
irb(main):037:0> ActsAsTaggableOn::Tagging.count
=> 139541

@bf4
Copy link
Collaborator

bf4 commented Apr 29, 2014

re: @avitus "page load has gone from 0.8 sec to 28 sec"

holy moly. I'm assuming you've rolled back to the previous release? Do you feel comfortable helping debug this? I'm no longer working on the app that used the plugin when I wrote the new index code.

@avitus
Copy link

avitus commented Apr 29, 2014

@bf4 Yep, I did rollback for now. On the positive side I did clean up a lot of duplicate tags 😃

I can take another look. I did take a brief look and it wasn't clear to me why it would be so slow.

@bf4
Copy link
Collaborator

bf4 commented Apr 29, 2014

Perhaps we should yank the last release and put a note in the README until we figure it out? OTOH, if you already ran the migration, and are running the previously version, then the slowness is likely in acts-as-taggable-on, not the migration..

@avitus
Copy link

avitus commented Apr 29, 2014

I only rolled back the migration so I think the problem lies in the new index. I don't understand why it would be so slow, though.

@seuros seuros added this to the 3.3.0 milestone May 1, 2014
@seuros seuros added the bug label May 1, 2014
@bf4
Copy link
Collaborator

bf4 commented May 2, 2014

what database and rails version are you on?

@avitus
Copy link

avitus commented May 2, 2014

Rails 4.0.3 and MySQL 5.5

@dalpo
Copy link

dalpo commented May 15, 2014

Any news about this issue?

@pcai
Copy link
Contributor

pcai commented Jun 3, 2014

This is because the unique compound index on taggings has a left-most column of tag_id. Using the example query given earlier:

mysql> explain SELECT `tags`.* FROM `tags` INNER JOIN `taggings` 
ON `tags`.`id` = `taggings`.`tag_id` WHERE `taggings`.`taggable_id` = 195395  
AND `taggings`.`taggable_type` = 'TskTask' AND (taggings.context = 'tags' 
AND taggings.tagger_id IS NULL)

The query cannot use the index because none of the conditions include the tag_id column. This is intuitive if you try to find all words in the english dictionary where the second letter is 'a'. It cannot be done without scanning the entire dictionary; the index is no help at all because the start of the index is on the first letter.

My solution to this is to add back the compound index that identifies the taggable in its left-most column(s):

add_index :taggings, [:taggable_id, :taggable_type, :context]

I also included context in this index, although this won't make much difference in most cases assuming the cardinality is low.

The practical effect of this, as with all indexes, is it will slightly slow table writes and use up slightly more storage space and tablespace memory.

@seuros
Copy link
Collaborator

seuros commented Jun 3, 2014

@pcai , do you mind sending a PR ?

@mgtf
Copy link

mgtf commented Jun 11, 2014

i dont know if it helps but i'm sucessfully using something like :

@all_tags = ActsAsTaggableOn::Tag.joins(:taggings).where(:taggings => {:context => 'tags'}).select('DISTINCT tags.name').map{ |x| x.name}
@all_authors = ActsAsTaggableOn::Tag.joins(:taggings).where(:taggings => {:context => 'authors'}).select('DISTINCT tags.name').map{ |x| x.name}

@pcai
Copy link
Contributor

pcai commented Jun 11, 2014

I've opened a pull request, but I only added back the index. Didn't do any testing, automated or otherwise.

@seuros
Copy link
Collaborator

seuros commented Jun 11, 2014

@jnimety can you try with master ?
@pcai send a PR that i merge.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

8 participants