Thursday, April 14, 2011

Fetch fields from a table that has the same relation to another table

I'll try to explain my case as good as i can. I'm making a website where you can find topics by browsing their tags. Nothing strange there. I'm having tricky time with some of the queries though. They might be easy for you, my mind is pretty messed up from doing alot of work :P.

I have the tables "topics" and "tags". They are joined using the table tags_topics which contains topic_id and tag_id. When the user wants to find a topic they might first select one tag to filter by, and then add another one to the filter. Then i make a query for fetching all topics that has both of the selected tags. They might also have other tags, but they MUST have those tags chosen to filter by. The amount of tags to filter by differs, but we always have a list of user-selected tags to filter by. This was mostly answered in http://stackoverflow.com/questions/648308/filtering-from-join-table and i went for the multiple joins-solution.

Now I need to fetch the tags that the user can filter by. So if we already have a defined filter of 2 tags, I need to fetch all tags but those in the filter that is associated to topics that includes all the tags in the filter. This might sound wierd, so i'll give a practical example :P

Let's say we have three topics: tennis, gym and golf.

  • tennis has tags: sport, ball, court and racket
  • gym has tags: sport, training and muscles
  • golf has tags: sport, ball, stick and outside

    1. User selects tag sport, so we show all three tennis, gym and golf, and we show ball, court, racket, training, muscles, stick and outside as other possible filters.
    2. User now adds ball to the filter. Filter is now sport and ball, so we show the topics tennis and golf, with court, racket, stick and outside as additional possible filters.
    3. User now adds court to the filter, so we show tennis and racket as an additional possible filter.

I hope I'm making some sense. By the way, I'm using MySQL.

From stackoverflow
  • SELECT   topic_id
    FROM     topic_tag
    WHERE    tag_id = 1
          OR tag_id = 2
          OR tag_id = 3
    GROUP BY topic_id
    HAVING   COUNT(topic_id) = 3;
    

    The above query will get all topic_ids that have all three tag_ids of 1, 2 and 3. Then use this as a subquery:

    SELECT tag_name
    FROM   tag
           INNER JOIN topic_tag
    ON     tag.tag_id = topic_tag.tag_id
    WHERE  topic_id  IN
                         ( SELECT  topic_id
                         FROM     topic_tag
                         WHERE    tag_id = 1
                               OR tag_id = 2
                               OR tag_id = 3
                         GROUP BY topic_id
                         HAVING   COUNT(topic_id) = 3
                         )
       AND
           (
                  tag.tag_id <> 1
               OR tag.tag_id <> 2
               OR tag.tag_id <> 3
           )
    

    I think this is what you are looking for.

    achinda99 : Instead of using IN, you can do an inner join but then you have to rename certain columns and I'm feeling lazy.
  • SELECT DISTINCT `tags`.`tag`
    FROM `tags`
    LEFT JOIN `tags_topics` ON `tags`.`id` = `tags_topics`.`tag_id`
    LEFT JOIN `topics` ON `tags_topics`.`topic_id` = `topics`.`id`
    LEFT JOIN `tags_topics` AS `tt1` ON `tt1`.`topic_id` = `topics`.`id`
    LEFT JOIN `tags` AS `t1` ON `t1`.`id` = `tt1`.`tag_id`
    LEFT JOIN `tags_topics` AS `tt2` ON `tt2`.`topic_id` = `topics`.`id`
    LEFT JOIN `tags` AS `t2` ON `t2`.`id` = `tt2`.`tag_id`
    LEFT JOIN `tags_topics` AS `tt3` ON `tt3`.`topic_id` = `topics`.`id`
    LEFT JOIN `tags` AS `t3` ON `t3`.`id` = `tt3`.`tag_id`
    WHERE `t1`.`tag` = 'tag1'
    AND `t2`.`tag` = 'tag2'
    AND `t3`.`tag` = 'tag3'
    AND `tags`.`tag` NOT IN ('tag1', 'tag2', 'tag3')
    
    finpingvin : Thanks, looks like a "simple" solution. Feels better optimised, performance wise, than doing subqueries.
    Julian Aubourg : Yes, but it's limited to 3 tags.
    finpingvin : But the query can be built dynamically in the application code
  • Select a.topic_id
      from join_table a
     where exists( select *
                     from join_table b
                    where a.tag_id = b.tag_id
                      and b.topic_id = selected_topic )
     group by a.topic_id
     having count(*) = ( select count(*)
                           from join_table c
                          where c.topic_id = selected_topic )
    

    Should give you a list of topics which have all of the tags for selected_topic.

  • Generic solution from the top of my head but prone to have typos:

      CREATE VIEW shared_tags_count AS
      SELECT topic_to_tag1.topic_id AS topic_id1, topic_to_tag2.topic_id AS topic_id2, COUNT(*) as number
        FROM topic_to_tag as topic_to_tag1
             JOIN topic_to_tag as topic_to_tag2
             ON topic_to_tag1.topic_id <> topic_to_tag2.topic_id
                 AND topic_to_tag1.tag_id = topic_to_tag2.tag_id
    GROUP BY topic_to_tag1.topic_id, topic_to_tag2.topic_id;
    
      CREATE VIEW tags_count AS
      SELECT topic_id, COUNT(*) as number
        FROM topic_to_tag
    GROUP BY topic_id
    
    CREATE VIEW related_topics AS
    SELECT shared_tags_count.topic_id1, shared_tags_count.topic_id2
      FROM shared_tags_count
           JOIN tags_count
             ON topic_id=topic_id1
                AND shared_tags_counts.number = tags_count.number
    
    CREATE VIEW related_tags AS
    SELECT related_topics.topic_id1 as topic_id, topic_to_tag.tag_id
      FROM related_topics
           JOIN topic_to_tag
           ON raleted_topics.tag_id2 = topic_to_tag.topic_id
    

    You just have to query the related_tags view.

    Interesting challenge btw.

    finpingvin : That looks complex :P
    Julian Aubourg : mainly because of table names ;) I like to break problems into views. More code but easier to maintain when you have changes in your schema.
    finpingvin : Yes, i can see that :) I've never used views actually, i might have to take a look at that.
  • can we select rows having the same value in a field?

0 comments:

Post a Comment