#1
  1. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Nov 2012
    Location
    DC
    Posts
    3
    Rep Power
    0

    Simple array compare question


    I have two text files that I am trying to compare, and I am not having much luck creating the output I need.

    The first file, auth_users, contains a list of users that have been granted a certain permission. Each line has a single entry, a userid. The userids are not guaranteed to be unique in the file; the file might look something like this:

    user1
    user2
    user1
    user9

    The second file, all_users, contains userids and the airport code closest to each user's office location. Something like this:

    user1 ABQ
    user2 ACY
    ...
    user1000 BIL

    I am trying to take each entry from the array I created from auth_users, match it against the all_users array, and then print the corresponding record from the all_users array.

    I am a newbie and I'm sure this is a two-second answer for someone who knows what they are doing. I tried a million things but cannot produce the results I am looking for; currently my code looks like this:

    Code:
    authUsers=File.readlines("auth_users.lst")
    allUsers=File.readlines("all_users.lst")
    
    authUsers.sort.uniq.each do |uid|
         puts allUsers.any? {uid}
    end
    This seems to do the match (prints true the expected number of times), but how do I print the matching lines from the allUsers array?

    Any assistance would be greatly appreciated.
  2. #2
  3. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Nov 2012
    Location
    DC
    Posts
    3
    Rep Power
    0

    got it to work but..


    Is there a better "ruby" way than this?

    Code:
    authUsers = ["user1","user2","user1"]
    allUsers = ["user1 abq","user2 bos","user3 mar"]
    
    authUsers.sort.uniq.each do |authorized|
    
            allUsers.each do |user|
    
                    puts user if (user[authorized])
            end
    end
  4. #3
  5. --
    Devshed Expert (3500 - 3999 posts)

    Join Date
    Jul 2012
    Posts
    3,959
    Rep Power
    1014
    Hi,

    storing data in plain text files is always the worst solution, especially when it's dynamic data like yours and you want to do more than just "get me all entries". Apart from the horrible performance, plain text "databases" are extremely fragile, because there's no consistent structure and no data validation (you can write any nonsense, forget the separator or whatever). And they're prone to editing conflicts.

    I understand that you might not want to install a fully-featured SQL server. But why not use a lightweight database system like SQLite (http://sqlite-ruby.rubyforge.org/sqlite3/faq.html)? SQLite databases are just files that you can embed in your application and then access with standard SQL (SELECT, DELETE, UPDATE etc.).

    If you really have no other way than using text files, then you should at least parse the entries properly. Your current solution doesn't work, because it uses substrings. For example, the user "foo" will get the data from any user name that has "foo" in it like "foobar".
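
    To see the false positive concretely, here is a small sketch with made-up sample data (the names foo and foobar and both helper methods are only illustrations). String#[] with a string argument returns that substring if it occurs anywhere in the line, so it cannot tell foo from foobar; comparing the first field after split can.

```ruby
# Hypothetical sample data, only for illustration.
LINES = ["foo ABC", "foobar DEF"]

# Substring test: selects every line in which uid occurs anywhere.
def substring_matches(lines, uid)
  lines.select { |line| line[uid] }
end

# Exact test: compares uid against the first whitespace-separated field.
def exact_matches(lines, uid)
  lines.select { |line| line.split.first == uid }
end

p substring_matches(LINES, "foo")  # both lines; "foobar DEF" is a false positive
p exact_matches(LINES, "foo")      # only "foo ABC"
```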

    You seem to think that Ruby has some kind of magical powers to understand what you mean. That's not the case. You can't just write down "any? {uid}" or "user[authorized]". You have to tell it what you actually want.
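
    As a sketch of what "any? {uid}" really does (sample data and method names made up for illustration): the block never looks at the array element, it simply returns uid, and every non-nil string counts as true. So the result is true for any non-empty array, whether the user is in it or not.

```ruby
# Made-up sample data for illustration.
ALL_USERS = ["user1 ABQ", "user2 ACY"]

# The block ignores its element and returns uid, which is always truthy.
def block_ignores_element(all_users, uid)
  all_users.any? { uid }
end

# Probably what was meant: is there a line whose first field equals uid?
def user_listed?(all_users, uid)
  all_users.any? { |line| line.split.first == uid }
end

p block_ignores_element(ALL_USERS, "nobody")  # true, although "nobody" is not listed
p user_listed?(ALL_USERS, "nobody")           # false
p user_listed?(ALL_USERS, "user2")            # true
```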

    For example:
    Code:
    users = [
      "tom ABC",
      "bill DEF",
      "peter GHI"
    ]
    auth_users = [
      "tom",
      "peter"
    ]
    
    users.each do |user|
      name, code = user.split		# split string at space
      puts "#{name} with code #{code} is an authorized user" if
        auth_users.include? name
    end
    Note that when you use your actual text files, the script has to remove trailing whitespace first.
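
    The reason, as a sketch: File.readlines keeps the newline at the end of each line by default, so an entry read from the authorized list is "tom\n", and that does not equal the "tom" you get from split. The arrays below are stand-ins for file contents, and the method name is just an illustration.

```ruby
# Stand-ins for File.readlines output; every entry still ends with "\n".
RAW_AUTH  = ["tom\n", "peter\n"]
RAW_USERS = ["tom ABC\n", "bill DEF\n", "peter GHI\n"]

# Names (first field of each users line) that appear in auth, compared exactly.
def authorized_names(users, auth)
  users.map { |line| line.split.first }.select { |name| auth.include?(name) }
end

p authorized_names(RAW_USERS, RAW_AUTH)               # [] because "tom" != "tom\n"
p authorized_names(RAW_USERS, RAW_AUTH.map(&:chomp))  # ["tom", "peter"]
```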
  6. #4
  7. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Nov 2012
    Location
    DC
    Posts
    3
    Rep Power
    0

    I oversimplified my example


    Jacques1, thank you for your response; it was very helpful in getting a better solution to my problem. I will definitely change the way I match things against the full user list to ensure I do not get any false positive matches (where a userid is the same as an airport code, for example).

    I simplified the problem statement to cover just the issue I was having. The data I am accessing comes from intranet-accessible web pages that I am grabbing with open-uri. The list of authorized users has duplicate userids; I'll look into removing duplicates and spaces before placing the data into an array. The list of all users is strictly formatted data, generated from an Oracle database that I don't have access to. I am not worried about any data inconsistencies in the all users list.

    I'm replacing a simple shell script that I've already created, in an attempt to get more familiar with Ruby. It simply compares the two lists and emails the users/office locations of all authorized folks in the department to a set of managers. It's a pretty simple script, which I don't think warrants the use of a db.

    I appreciate your assistance; you definitely have me going in the right direction again!
  8. #5
  9. --
    Devshed Expert (3500 - 3999 posts)

    Join Date
    Jul 2012
    Posts
    3,959
    Rep Power
    1014
    I still don't see the point of doing all kinds of sorting, duplicate removing etc. with the authorized users list when you say that your users list has more "clean" data, anyway.

    Why not loop through this list and check for every user if he's on the authorized users list, like I did? All you have to do then is remove empty lines and unwanted space from the authorized users:
    Code:
    auth_users =
      File.foreach("auth_users.lst").map(&:strip).reject(&:empty?)
    Duplicates or a wrong order don't matter.

    Regarding the "false positives": The only way to prevent this is by actually splitting the lines. Don't rely on substring magic or something.
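
    Putting both points together, a sketch with made-up data: the authorized list below contains a duplicate, a blank line, entries in the "wrong" order and an entry that happens to look like an airport code. The method name and sample values are assumptions for illustration. The Set is only there to make the lookup cheap; an Array with include? gives the same result.

```ruby
require "set"

# Lines from the users list whose userid (first field) is in the authorized list.
def authorized_records(auth_lines, user_lines)
  auth = auth_lines.map(&:strip).reject(&:empty?).to_set
  user_lines.select { |line| auth.include?(line.split.first) }
end

# Stand-ins for the two lists; "BIL" is an authorized entry that looks like a code.
auth_lines = ["user2\n", "user1\n", "\n", "user2\n", "BIL\n"]
user_lines = ["user1 ABQ", "user2 ACY", "user3 BIL"]

puts authorized_records(auth_lines, user_lines)
# prints "user1 ABQ" and "user2 ACY", each once; "user3 BIL" is not matched by "BIL"
```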
