Question regarding keyword checkboxes in database design
Apologies in advance, because this post might be long, but I'm not sure how to explain what I need to for this question to make sense otherwise.
I'm a total PostgreSQL beginner, working in a contract job that initially involved developing a database in FileMaker Pro and then evolved into porting it to the web using PostgreSQL because FMP web hosting was too expensive. So now I'm working on all kinds of thorny conversion issues. I've more or less given up on getting the two to interact the way I'd initially hoped, using a third-party SQL plug-in for FMP, because there are too many differences in the way they store data. So it's probably going to end up being only in Pg, and with people who update it after my contract ends using a web form to do so. But in the meantime, I need to get the existing FMP database totally converted over to Pg.
So today's conversion question -- or set of questions, really -- relates to the several lists of checkboxes in the current FMP database, for keywords under which the images, artists and exhibitions are categorized (the database is for an art gallery). In FMP, these are handled by using one field per *set* of checkboxes, with a value list associated with it that provides the different checkboxes or radio buttons in the layouts, but stores the data as one long string delimited by some weird character that doesn't display properly when you export the content as text.
This works fine in FMP, but my impression from the book I'm currently reading ("Beginning Databases with PostgreSQL" by Richard Stones and Neil Matthew) is that in Pg only one type of information should be stored in each column, so each keyword checkbox should probably be one column. So keeping the current structure where the keywords in any given set (i.e. subject, medium, etc.) are all lumped together into one long string for each record would be considered bad database design and probably make scripting trickier. But if I make each keyword into a boolean field, I'm not sure how to display them. From what I've read thus far, the select statements that you use to display data in Pg usually return the data in the columns, not the column names. When people ultimately view the database records on the web, I want them to be able to see something like:
Medium: black and white photograph
Subject: landscape, nature
Processes: hand-tinting, photocollage
Or something to that effect, but NOT:
black and white photograph: yes
So I'm thinking that probably each set of keywords (medium, subject, etc.) probably needs to be a table and each keyword a boolean column within it, but I'm not sure if there's a way to modify a select statement so that it outputs the names of all columns with a yes/true/1 in them, rather than outputting the values themselves.
Or even, for that matter, if this really is the best way of doing it or not. All three of the main data files - artists, exhibitions and images - need to reference these keywords, so it wouldn't really make sense to include them within each of those tables (despite the fact that that's currently the way it's set up in FMP). It seems more logical to make each set a table and then use a combination of the id number from the artist/image/etc. record plus a code like "a" for artist, "i" for image and "e" for exhibition as a primary key to relate each record in the keyword tables back to record the keywords are describing. Does that make sense? Or is there another way that would be better?
Sorry for the length of this, but like I said, I'm new to Pg and I wasn't sure how else to explain it. BTW, I'll be using PHP to create the front end of this for the web site, if that makes a difference.
I hope I understand you correctly. You have different types of keywords. Each record can have multiple keywords for each keyword type.
Here's a suggestion. You could have one table for just keywords. Each keyword would be linked to both an item and to a keyword type. When querying for data, you could group by the keyword type.