How to flatten a MYSQL table? Posted in the MySQL Help forum on Dev Shed, which covers administration, SQL syntax, and other MySQL-related topics.
Posts: 13
Time spent in forums: 3 h 16 m 38 sec
Reputation Power: 0
How to flatten a MYSQL table?
I'm new to MySQL and would like some help flattening a table. Suppose I have the following table:
UID GO
Q9NQG7 GO:0005764
Q9NQG7 GO:0042470
P02GN1 GO:0005624
P02GN1 GO:0003461
Instead of having the UID repeated alongside the GO terms, I want the distinct UIDs grouped into one column and all the GO terms laid out in a row. In this case the flattened file would be as follows.
Posts: 13
Time spent in forums: 3 h 16 m 38 sec
Reputation Power: 0
Quote:
Originally Posted by r937
are you looking to create a new table? or are you just interested in pulling out the data from your existing table?
also, why are you interested in "flattening" this data? what's wrong with the way it's stored now?
Yes, I'm looking to create a new table. Some of the records in the table are similar. My goal is to eliminate the duplicates and end up with two rows.
Part of my thesis involves running one of my machine learning algorithms on a database and comparing the performance, but to do so I have to flatten the data, and I'm having some difficulties.
Posts: 25,046
Time spent in forums: 3 Months 2 Days 22 h 44 sec
Reputation Power: 3829
here ya go...
Code:
SELECT UID
, MAX(CASE WHEN GO = '0005624'
THEN 1 ELSE 0 END) AS '0005624'
, MAX(CASE WHEN GO = '0005764'
THEN 1 ELSE 0 END) AS '0005764'
, MAX(CASE WHEN GO = '0042470'
THEN 1 ELSE 0 END) AS '0042470'
, MAX(CASE WHEN GO = '0003461'
THEN 1 ELSE 0 END) AS '0003461'
FROM daTable
GROUP
BY UID
Posts: 13
Time spent in forums: 3 h 16 m 38 sec
Reputation Power: 0
Thank you so much, the query works great! But when I test it with repeated GO terms, it doesn't work correctly: it outputs only "0". For example, when I run the following test case:
Posts: 1,914
Time spent in forums: 1 Month 5 Days 11 h 34 sec
Reputation Power: 1297
If the values in your GO column actually are as you show (e.g., GO:0042470), then you'll need to change the CASE expressions to look for that style of value:
Code:
SELECT UID
, MAX(CASE WHEN GO = 'GO:0005624'
THEN 1 ELSE 0 END) AS '0005624'
, MAX(CASE WHEN GO = 'GO:0005764'
THEN 1 ELSE 0 END) AS '0005764'
, MAX(CASE WHEN GO = 'GO:0042470'
THEN 1 ELSE 0 END) AS '0042470'
, MAX(CASE WHEN GO = 'GO:0003461'
THEN 1 ELSE 0 END) AS '0003461'
FROM daTable
GROUP
BY UID
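To see why the prefix matters, here is a minimal Python sketch that is not part of the original thread. It loads the four sample rows into an in-memory SQLite database (used here as a stand-in for MySQL, which is an assumption made so the snippet runs anywhere) and runs the same MAX(CASE ...) test with and without the "GO:" prefix. The helper name flags is hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for MySQL (an assumption for portability);
# the MAX(CASE ...) pattern is plain SQL and reads the same in both.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daTable (UID TEXT, GO TEXT)")
conn.executemany(
    "INSERT INTO daTable VALUES (?, ?)",
    [
        ("Q9NQG7", "GO:0005764"),
        ("Q9NQG7", "GO:0042470"),
        ("P02GN1", "GO:0005624"),
        ("P02GN1", "GO:0003461"),
    ],
)


def flags(term_literal):
    """Return {UID: 0 or 1} for one GO literal (hypothetical helper)."""
    sql = (
        "SELECT UID, MAX(CASE WHEN GO = ? THEN 1 ELSE 0 END) "
        "FROM daTable GROUP BY UID"
    )
    return dict(conn.execute(sql, (term_literal,)).fetchall())


# The literal must match the stored value exactly, prefix included.
without_prefix = flags("0005764")
with_prefix = flags("GO:0005764")
print(without_prefix)
print(with_prefix)
```

Without the prefix no row matches, so every UID gets 0; with it, only Q9NQG7 is flagged.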
Posts: 13
Time spent in forums: 3 h 16 m 38 sec
Reputation Power: 0
Thank you! This is absolutely awesome! I'm going to test it on a larger data set and see how it goes.
Posts: 13
Time spent in forums: 3 h 16 m 38 sec
Reputation Power: 0
Hi, thank you so much for your help! The SQL command works great for small data sets, but I can't use it on larger data because I would have to enter every single line manually. I'm using a pretty large data set for my thesis, and I need help setting up a program that gives the same outcome automatically, without having to write thousands of lines manually. Thank you!
Posts: 25,046
Time spent in forums: 3 Months 2 Days 22 h 44 sec
Reputation Power: 3829
Quote:
Originally Posted by cjassi08
...without having to write thousands of lines manually. Thank you!
thousands?
you prolly won't be able to do that
writing thousands of CASE expressions implies you want a row that has thousands of columns
do a search in the mysql manual for this:
Quote:
There is a hard limit of 4096 columns per table, but the effective maximum may be less for a given table. The exact limit depends on several interacting factors, listed in the following discussion.
and then read the following discussion
you're probably going to want to leave your table narrow (two columns) and do whatever analysis you need to do in some specialized software
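For data sets that do stay under the column limit, the CASE expressions do not have to be typed by hand. The following is only a sketch, not something posted in the thread: it reads the distinct GO values and builds the SELECT text from them. The function name build_pivot_sql, the assumed "GO:" plus seven digits term format, and the use of an in-memory SQLite database in place of MySQL are all assumptions made for illustration.

```python
import re
import sqlite3

MAX_COLUMNS = 4096  # hard per-table limit quoted from the MySQL manual above
GO_PATTERN = re.compile(r"GO:(\d{7})")  # assumed term format, as in the sample data
COLUMN_TEMPLATE = "MAX(CASE WHEN GO = '{term}' THEN 1 ELSE 0 END) AS \"{alias}\""


def build_pivot_sql(terms, table="daTable"):
    """Build a SELECT with one MAX(CASE ...) column per GO term."""
    if len(terms) + 1 > MAX_COLUMNS:  # +1 accounts for the UID column
        raise ValueError("%d terms would exceed %d columns" % (len(terms), MAX_COLUMNS))
    select_list = ["UID"]
    for term in terms:
        match = GO_PATTERN.fullmatch(term)
        if match is None:  # refuse anything that is not a plain GO id
            raise ValueError("unexpected GO term: %r" % (term,))
        select_list.append(COLUMN_TEMPLATE.format(term=term, alias=match.group(1)))
    return "SELECT {} FROM {} GROUP BY UID ORDER BY UID".format(
        ", ".join(select_list), table
    )


# Demo on the sample rows from the first post.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daTable (UID TEXT, GO TEXT)")
conn.executemany(
    "INSERT INTO daTable VALUES (?, ?)",
    [
        ("Q9NQG7", "GO:0005764"),
        ("Q9NQG7", "GO:0042470"),
        ("P02GN1", "GO:0005624"),
        ("P02GN1", "GO:0003461"),
    ],
)
terms = [row[0] for row in conn.execute("SELECT DISTINCT GO FROM daTable ORDER BY GO")]
sql = build_pivot_sql(terms)
rows = conn.execute(sql).fetchall()
print(sql)
print(rows)
```

The guard raises an error before any SQL is sent when the number of distinct terms would push the result past 4096 columns, which is the situation described above where the narrow two-column table is the better choice.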
Posts: 1,914
Time spent in forums: 1 Month 5 Days 11 h 34 sec
Reputation Power: 1297
APL, from what I have seen, is a stupendous language for stats and analysis. Also from what I have seen it's a seriously weird language (some of our Nokia keyboards for mainframe terminals had special APL characters)!
Posts: 13
Time spent in forums: 3 h 16 m 38 sec
Reputation Power: 0
Thank you, but I'm not familiar with APL, and I can't seem to find any programming tutorials or examples. It probably won't work for me.
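If APL is not an option, the "leave the table narrow and analyse it elsewhere" route can also be taken with a short script. The sketch below is an illustration only: it assumes the (UID, GO) pairs have already been fetched, for example with SELECT UID, GO FROM daTable through whatever MySQL driver is available, and it pivots them in memory into CSV text. The function name pivot_to_csv and the extra repeated row in the sample are hypothetical.

```python
import csv
import io
from collections import defaultdict


def pivot_to_csv(rows):
    """Turn (UID, GO) pairs into CSV text with one True/False column per GO term."""
    terms_by_uid = defaultdict(set)
    for uid, term in rows:
        terms_by_uid[uid].add(term)  # a set absorbs repeated (UID, GO) pairs
    all_terms = sorted(set().union(*terms_by_uid.values()))

    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["UID"] + all_terms)  # header row: UID, then every GO term
    for uid in sorted(terms_by_uid):
        present = terms_by_uid[uid]
        writer.writerow([uid] + [str(term in present) for term in all_terms])
    return out.getvalue()


sample = [
    ("Q9NQG7", "GO:0005764"),
    ("Q9NQG7", "GO:0042470"),
    ("P02GN1", "GO:0005624"),
    ("P02GN1", "GO:0003461"),
    ("P02GN1", "GO:0003461"),  # deliberate repeat to show duplicates collapse
]
csv_text = pivot_to_csv(sample)
print(csv_text)
```

Because the wide layout exists only in the output file, the per-table column limit in MySQL does not come into play, although the consuming program may have limits of its own.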
Posts: 13
Time spent in forums: 3 h 16 m 38 sec
Reputation Power: 0
I would like to output 'True' and 'False' instead of '1' and '0' because my program rejects numerical values. The MySQL query is as follows:
SELECT UID
, MAX(CASE WHEN GO = 'GO:0005624'
THEN 1 ELSE 0 END) AS '0005624'
, MAX(CASE WHEN GO = 'GO:0005764'
THEN 1 ELSE 0 END) AS '0005764'
, MAX(CASE WHEN GO = 'GO:0042470'
THEN 1 ELSE 0 END) AS '0042470'
, MAX(CASE WHEN GO = 'GO:0003461'
THEN 1 ELSE 0 END) AS '0003461'
FROM daTable
GROUP
BY UID
I have tried the following without any luck:
SELECT UID IF((SELECT CASE WHEN 1>0 THEN 'true' ELSE 'false' END),'true','false');
, MAX(CASE WHEN GO = 'GO:0005624'
THEN 1 ELSE 0 END) AS '0005624'
, MAX(CASE WHEN GO = 'GO:0005764'
THEN 1 ELSE 0 END) AS '0005764'
, MAX(CASE WHEN GO = 'GO:0042470'
THEN 1 ELSE 0 END) AS '0042470'
, MAX(CASE WHEN GO = 'GO:0003461'
THEN 1 ELSE 0 END) AS '0003461'
FROM daTable
GROUP
BY UID
Posts: 2,279
Time spent in forums: 1 Month 1 Day 22 h 39 m 15 sec
Reputation Power: 388
Code:
select UID,
max(case when GO = 'GO:0005624' then 'TRUE' else 'FALSE' end) as "0005624",
max(case when GO = 'GO:0005764' then 'TRUE' else 'FALSE' end) as "0005764",
max(case when GO = 'GO:0042470' then 'TRUE' else 'FALSE' end) as "0042470",
max(case when GO = 'GO:0003461' then 'TRUE' else 'FALSE' end) as "0003461"
from daTable
group
by UID
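The query above relies on MAX comparing strings: 'TRUE' sorts after 'FALSE', so a single matching row is enough to make the group's result 'TRUE'. As a hedged illustration (not from the thread), the Python sketch below runs that form next to a variant that aggregates the numeric flag first and converts it to text afterwards, which does not depend on how the two labels sort. An in-memory SQLite database stands in for MySQL, and only one GO column is shown to keep it short.

```python
import sqlite3

# In-memory SQLite stands in for MySQL (an assumption for portability).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daTable (UID TEXT, GO TEXT)")
conn.executemany(
    "INSERT INTO daTable VALUES (?, ?)",
    [
        ("Q9NQG7", "GO:0005764"),
        ("Q9NQG7", "GO:0042470"),
        ("P02GN1", "GO:0005624"),
        ("P02GN1", "GO:0003461"),
    ],
)

# Form used in the post: MAX over the strings themselves.
string_max_sql = """
SELECT UID,
       MAX(CASE WHEN GO = 'GO:0005764' THEN 'TRUE' ELSE 'FALSE' END)
FROM daTable GROUP BY UID ORDER BY UID
"""

# Variant: aggregate the 1/0 flag, then map the result to a label.
explicit_sql = """
SELECT UID,
       CASE WHEN MAX(CASE WHEN GO = 'GO:0005764' THEN 1 ELSE 0 END) = 1
            THEN 'TRUE' ELSE 'FALSE' END
FROM daTable GROUP BY UID ORDER BY UID
"""

string_rows = conn.execute(string_max_sql).fetchall()
explicit_rows = conn.execute(explicit_sql).fetchall()
print(string_rows)
print(explicit_rows)
```

Both forms agree on the sample data. The second is worth considering only if the labels might change to a pair whose "true" value does not sort last.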