I’ve been looking at the well known and documented ways to do tagging.
I was undecided between MySQLicious and Toxi methods.
For my use case I currently just retrieve all the tags anyway, so MySQLicious seems like the way to go.
We have an item which has tags associated with it. Currently we fetch all items and as such with them all possible tags. We then do tag filtering/searching on the client side.
However as we scale fetching all the items at once won’t be possible, hence searching for tags on the client side will also become unfeasible.
With MySQLicious we dead straight hit a wall. So Toxi seems like the best solution, however with Toxi we now spend extra time creating tags and also can end up with dangling tags if we are not careful (easily avoidable). But as we scale again the relation poses a problem for fetching tags for the selected items.
Enter jsonb, in PostgreSQL
We have a jsonb column, we can insert an array of tags. Using the json features in Postgre we can then easily search, and also retrieve all tags.
See the below example, which can be ran here https://www.db-fiddle.com/f/qc8ZAvTuDLPAg14bvUqcFR/0
CREATE TABLE test ( name char(30), tags jsonb ); insert into test values ('item1', '["something", "something2"]'); insert into test values ('item2', '["something", "rupert"]'); insert into test values ('item3', '["websearch", "rupert"]'); select * from test; -- select a specific item with all its tags select * from test where name = 'item1'; -- select all items with tag websearch select * from test where tags ? 'websearch'; -- select all items with tag something select * from test where tags ? 'something'; -- select all items with tag something and tag something2 select * from test where tags ? 'something' and tags ? 'something2'; -- select all items with tag websearch or tag something2 select * from test where tags ? 'websearch' or tags ? 'something2'; UPDATE test SET tags = tags || '["banana"]'::jsonb WHERE name = 'item3'; select * from test; UPDATE test SET tags = tags - 'banana' WHERE name = 'item3'; select * from test;
We can the use gin indexes to speed up our searches.
This approach also makes our queries quite simple for both use cases.
The most obvious disadvantage is we still have duplicated tags, just like in the MySQLicious.
Pending performance tests, I suspect this also gives a mid approach to performance for each of the use cases for which MySQLicious and Toxi are better suited.
Never excelling, but never being last. As this becomes slow, I expect the best way forward will be something like solr or elasticsearch.