January 2nd, 2013, 05:57 PM
MySQL Insert Location
I need to be able to insert a row into a table into a random location, it can not simply go at the "end" of the table.
I can determine a random place, but how can I specify say? INSERT.... AFTER xfield='value'? Or even after a row number.
I have considered determining a row that would be a random location, replacing that record with the new record and then placing the replaced record in the location of the new record. This is not exact what I was looking for. I was hoping there was something that could be handled by MySQL.
Last edited by seeked; January 2nd, 2013 at 06:07 PM.
January 2nd, 2013, 06:44 PM
You don't. That's not how data is supposed to work.
If you want the results in a certain order then do that in your SELECT.
January 3rd, 2013, 04:06 AM
I understand how it is supposed to work but I am working on a project that REQUIRES that this data is placed in a random location. I know I could get the data out in a random way but for security and anonymity, it cannot be stored chronologically.
Originally Posted by requinix
January 3rd, 2013, 04:39 AM
off the top of my head i don't think that it is possible with mysql directly (maybe ask in the MySQL forum...)
A hacky way would be to create n tables (table1, table2, table3 etc) which all have the same structure etc. When you do your insert statement, randomly pick a table number and use that for the insert. If anyone gets a glimpse at the tables, they will know which was the latest added to table(n) but not the overall chronology. It makes your life a little harder when you select the data etc, but it provides some obfuscation. Obfuscation is the key word here. If you're trying to use this for security then you are either:
A) working with crazy top secret info; or
B) (more likely) failing at security in another point.
If having the data in 1 table is required then use the above method, then do a python script (run by cron) that selects everything from all tables, randomises it and puts it into the main table.
As Requinix said though, that's not how you're meant to use a database, which probably means you will have a hard time trying to get this running smoothly and efficiently.
It's also worth noting that most linux systems will log a heap of stuff behind the scenes (depending on how the admin set them up). If you're not in full control of the server this is running on, then you wont know what's logged and where, and there is probably another way of someone getting this info anyways...
January 3rd, 2013, 06:00 AM
the whole idea is conceptually wrong. When certain people are not supposed to see certain data, the solution is to not let them access it. Give them a view instead. In your case you can simply assign each row a random number on insertion and then ORDER BY that number in your view -- you can also ORDER BY RAND() on the fly, but this is a pretty inefficient solution. What this won't prevent, of course, is people simply keeping track of newly inserted rows.
SQL tables by definition are not ordered (which is what many people don't understand). They represent mathematical sets. That's why something like "INSERT AFTER" doesn't even make sense.
Of course there will always by some kind of order involved due to the fact that the tables have to be stored as bits and bytes. But that's purely technical and happens behind the scenes. When you find yourself trying to change that, you seriously need to rethink your current approach.
Like I said: This is a matter of user management. Do not fumble with the original data.
Last edited by Jacques1; January 3rd, 2013 at 06:04 AM.
January 4th, 2013, 02:33 AM
Perhaps I should elaborate...
I am writing a messaging system. There are two tables. The data to associate the data in one table with another table is encrypted. The random location of the data in each database is to maintain anonymity, not for security (perhaps I misspoke earlier).
Let me reiterate, I UNDERSTAND HOW DATABASES WORK.
I swear, this is why I hate asking questions on forums like this. I have a reason I need to do this. Do you feel better about your selves since you lectured me about "how it is supposed to work"?
January 4th, 2013, 04:25 AM
I'm really sorry that your idea is wrong, but that's the way it is. I don't see the point of lying to you or making up nonsense just so that you can somehow keep the approach and feel good.
I've given you the correct solution for the problem (views). If you don't like it, you'll have to write your own MySQL variant, which stores the bits and bytes in a specific order (using /dev/(u)random or something).
Anybody else will tell you the same. If they don't, they have no clue about SQL.
January 4th, 2013, 04:35 AM
Moving PHP -> MySQL
Thank God you told us right up front exactly what that was. I'm sorry, we really didn't have any reason to suspect you were just some newbie programmer asking for a solution to something you perceived to be a problem. We shouldn't have tried to explain anything and should have simply given you our best answers right off the bat.
Originally Posted by seeked
Obviously you can anonymize and obfuscate the data, so that basically only leaves the problem of knowing (with a high but not perfect degree of accuracy) in what order the data was inserted. I urge you to consider whether this is an actual problem or not, whether knowing that one record is newer than another actually matters.
I believe that MySQL will order the table by the primary key; if you don't need that for the data itself then you could probably muck around with an artificial PK. I'm not sure exactly how though because there'd be potential problems of uniqueness - you couldn't just use a random number, for instance.
Another idea would be shuffling the table every time a row is inserted (or periodically if you can get away with that), or reducing the impact by using multiple tables like sir_drinxalot said.
Also don't forget the other stuff happening behind the scenes. If you're really, really paranoid then you also have to consider data held in memory, the backend files MySQL uses, the database connections...
Comments on this post
January 4th, 2013, 05:11 AM
Ok, I get (sort of) the requirement to have the physical location of the row inserted into the table being 'random' so that it's location cannot be used to extrapolate relative newness of the data. That presupposes that someone is going to be bypassing the database in looking at the raw data on disk or in the file(s).
Usually when you put data into a database you use that database to extract the data for processing as I am very aware that you are fully cognizant of. I think this is where people are getting confused - just how will (or can be) the data be looked at?
The moon on the one hand, the dawn on the other:
The moon is my sister, the dawn is my brother.
The moon on my left and the dawn on my right.
My brother, good morning: my sister, good night.
-- Hilaire Belloc
January 4th, 2013, 05:37 AM
And like I already said: What prevents me from simply fetching the data every minute and writing down the newly inserted rows?
In the end, it's a contradiction in terms: You give people full access to the database, yet you don't really want them to see the data. I mean, are you trying to hide the data from yourself?
Anybody with full access will have full access and can circumvent any obfuscation you may set up. If you don't want people to have full access, then don't give them full access. Let them see a view instead of the raw data.
But I guess this discussion is more of an intellectual game than it's about actual solutions in the real world.
January 4th, 2013, 05:56 AM
To elaborate even further, there are two tables one is a list of randomly generated IDs that is not encrypted. The reason I am keeping this list is to ensure that when a new user is registered, although it is incredibly unlikely, I do not want someone's ID to be duplicated. The ID must not be able to be linked to the user. In a separate table, there is all of the user's information, most of which is encrypted including there ID. Why you ask? Because, when a message is placed into the messages table there are two fields, one for the sender's ID and one for the recipient's ID. The username is not used, and all of the information other than the IDs is encrypted. There is no way to know who is communicating with who. If you wish to send a message to someone you must fist add them as a contact. There is a contact request that is encrypted of course (with openssl) and placed in a requests table. The recipients user name will be used here since the requester does not know there ID yet however, the requester's ID is send to the recipient of the request, encrypted of course. Once the request is accepted there is a message that is sent back to the requester containing the original recipients ID.
At this point a message can be sent using the IDs but from the outside no one knows WHO the messages are to and from unless they have that user's ID. Although, it is worth knowing that the users never actually see the IDs, that all goes on in the background. I suppose if someone got their password and had access to the raw encrypted data within the database they could always decrypt their contact list and see the IDs.
The users ID that is encrypted in the same table as their username is what they will use to reference the messages table and retrieve their messages.
The point of inserting the rows in a random place it, if a user registers in one table their username will be placed and in another table the ID will be placed. If they are both placed in the same places, it will be easy to figure out which ID belongs to which user.
January 4th, 2013, 06:07 AM
Without remembering the details at the time I was talking about InnoDB.
Using Primary Keys
MySQL question, MySQL answer.
Comments on this post
January 4th, 2013, 06:35 AM
dear seeked, all that stuff you wrote about encryption and so on was a bit confusing, but the point remains, you cannot place rows into a "random location" in a table, but as requinix has suggested, the primary key is a clustering index
but you're already generating a random ID -- make that your primary key, and rows will be stored/retrieved in that order
since users never see this ID, there's no way that they can "figure out which ID belongs to which user"
January 4th, 2013, 07:58 AM
I don't care about getting the data out in a particular order, random or otherwise. I need to store it in a random order. I can determine a random place by getting the number of rows. Although the users may not have access to the IDs, anyone who gains access to the database would be able to see the corresponding Username and ID if they are simply inserted chronologically.
Originally Posted by r937
I was simply looking for a way to say, insert as row 7 or insert after this row where this field contains this data, much as you would select a row based on the value of an index field.
January 4th, 2013, 08:58 AM
"insert as row 7" won't work because rows aren't numbered
Originally Posted by seeked
"after this row" won't work because "before" and "after" have no meaning