
March 11th, 2013, 05:03 PM
|
|
Registered User
|
|
Join Date: Mar 2013
Posts: 3
Time spent in forums: 25 m 6 sec
Reputation Power: 0
|
|
De-duplicate 'smart' random row query?
Hi guys,
following some research, I found a query suitable for my needs: it randomly returns IDs from the table. The ID field is an auto-increment, so there are no holes.
Code:
SELECT `mydb`.`myTable`.id
FROM (SELECT Floor (Rand() * (SELECT Count(*)
FROM `mydb`.`myTable`)) num,
@num := @num + 1
FROM (SELECT @num := 0) a,
`mydb`.`myTable`
LIMIT 2000000) b,
`mydb`.`myTable`
WHERE b.num = `mydb`.`myTable`.id
The issue I experience is that the target table (myTable) contains 30-400M records, depending on the situation. In the LIMIT, I want to retrieve 2M randomly selected IDs, however I get a lot of duplicates (which is expected).
Is it possible to de-duplicate the query and yet receive 2M records? I thought to create a table and let it manage the UNIQUE values, but again I will get less than expected.
Any thoughts? Many thanks!
|