Thread: Sorting by date

    #1
  1. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jul 2004
    Posts
    188
    Rep Power
    11

    Sorting by date


    I have a 2D array where the first element of each row is a date in the format dd-mm-yyyy, and I need to arrange the rows in date order. I've searched here and on Google, but I can't really figure out how I might do this.

    Unless there's an existing Perl function that does this automatically, I guess I'll have to write a subroutine returning 1, 0 or -1 depending on conditions, but with three fields to compare it's going to be pretty complicated.

    Do any existing functions do this sort of thing, or will I have to write my own? If I do, where might I begin?

    Thanks for any help,

    Lewis.
  2. #2
  3. 'fie' on me, allege-dly
    Devshed Supreme Being (6500+ posts)

    Join Date
    Mar 2003
    Location
    in da kitchen ...
    Posts
    12,889
    Rep Power
    6444
    Change your date format to yyyy-mm-dd, and sort the array
    --Ax
  4. #3
  5. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jul 2004
    Posts
    188
    Rep Power
    11
    Just sort it and it'll work?

    So do I keep the 2D array or just give it the strings?
  6. #4
  7. No Profile Picture
    PerlGuy
    Devshed Novice (500 - 999 posts)

    Join Date
    Jan 2001
    Posts
    720
    Rep Power
    42
    Not sure I understand what you're asking, so let me see if I can clarify:

    You have an array of arrays, and the first element of each nested array is a date. You want to sort the array of arrays by this date field so:
    Code:
    $array = [
        [ '02-05-2005', 'elem2', 'elem3' ],
        [ '02-02-2005', 'elem2', 'elem3' ],
    ];
    becomes:
    Code:
    $array = [
        [ '02-02-2005', 'elem2', 'elem3' ],
        [ '02-05-2005', 'elem2', 'elem3' ],
    ];
    - dsb -
    Perl Guy
  8. #5
  9. 'fie' on me, allege-dly
    Devshed Supreme Being (6500+ posts)

    Join Date
    Mar 2003
    Location
    in da kitchen ...
    Posts
    12,889
    Rep Power
    6444
    perldoc sort from your command line

    The date will have to be in YYYY-MM-DD format though, otherwise 01-12-2005 would appear before 02-01-2004, and you don't want that
    --Ax
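    To make the ordering problem concrete, here is a minimal sketch using the two dates mentioned above. The variable names are only illustrative, and the dates are assumed to be dd-mm-yyyy strings as in the original question.

```perl
use strict;
use warnings;

# Two dd-mm-yyyy dates: 1 Dec 2005 and 2 Jan 2004.
my @dmy = ('01-12-2005', '02-01-2004');

# A plain string sort compares the day first, so 2005 lands before 2004.
my @naive = sort @dmy;

# Rewrite each date as yyyy-mm-dd. The most significant field now leads,
# so string order and chronological order agree.
my @converted = map { my ($d, $m, $y) = split /-/; "$y-$m-$d" } @dmy;
my @ymd       = sort @converted;

print "naive: @naive\n";
print "ymd:   @ymd\n";
```

    The first line printed keeps the 2005 date ahead of the 2004 one, while the second line is in true date order.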
    without exception, there is no rule ...
    Handmade Irish Jewellery
    Targeted Advertising Cookie Optout (TACO) extension for Firefox
    The great thing about Object Oriented code is that it can make small, simple problems look like large, complex ones


    09 F9 11 02
    9D 74 E3 5B
    D8 41 56 C5
    63 56 88 C0
    Some people, when confronted with a problem, think "I know, I'll use regular expressions." Now they have two problems.
    -- Jamie Zawinski
    Detavil - the devil is in the detail, allegedly, and I use the term advisedly, allegedly ... oh, no, wait I did ...
    BIT COINS ANYONE
  10. #6
  11. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jul 2004
    Posts
    188
    Rep Power
    11
    Yes dsb, that's spot on. Sorry I didn't explain it very well.

    I'll have a look into just using sort; it'd be so much easier, seeing as the sort routine I just wrote does nothing to the array!
  12. #7
  13. !~ /m$/
    Devshed Specialist (4000 - 4499 posts)

    Join Date
    May 2004
    Location
    Reno, NV
    Posts
    4,252
    Rep Power
    1810
    Normally when you want to sort on one of the elements, it would be like so:
    Code:
    my @sorted = sort {$a->[0] cmp $b->[0]} @array;
    But since your dates are not that straightforward, it would need to be something like:
    Code:
    #!/usr/bin/perl
    use strict;
    use warnings;
    
    my @array = (
    	['05-25-1960', 'Mark'],
    	['03-31-1961', 'Heather'],
    	['08-12-1963', 'Lisa'],
    	['11-05-1968', 'Kathryn']
    );
    
    my @sorted = sort by_date @array;
    
    foreach (@sorted) {
    	print $_->[0]."\n";
    }
    	
    sub by_date {
    	my @a = split /-/, $a->[0];
    	my @b = split /-/, $b->[0];
    	$a[2] <=> $b[2] || #year
    	$a[0] <=> $b[0] || #month
    	$a[1] <=> $b[1]; #day
    }
    Although what I usually do is just convert the dates to integers first, and store them that way. Then they sort directly.
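    A rough sketch of that integer approach, assuming the same mm-dd-yyyy strings as the example above. The helper name date_to_int is made up for illustration and the yyyymmdd encoding is just one possible choice of integer.

```perl
use strict;
use warnings;

# Hypothetical helper: turn an mm-dd-yyyy string into the integer yyyymmdd.
sub date_to_int {
    my ($mm, $dd, $yyyy) = split /-/, shift;
    return $yyyy * 10_000 + $mm * 100 + $dd;
}

my @array = (
    ['11-05-1968', 'Kathryn'],
    ['05-25-1960', 'Mark'],
    ['08-12-1963', 'Lisa'],
    ['03-31-1961', 'Heather'],
);

# Store the integer in place of the string, once, up front.
$_->[0] = date_to_int($_->[0]) for @array;

# The integers now sort with a plain numeric comparison.
my @sorted = sort { $a->[0] <=> $b->[0] } @array;

print "$_->[0] $_->[1]\n" for @sorted;
```

    The trade-off is that the stored value has to be formatted back into a readable date whenever it is displayed.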
  14. #8
  15. No Profile Picture
    PerlGuy
    Devshed Novice (500 - 999 posts)

    Join Date
    Jan 2001
    Posts
    720
    Rep Power
    42
    Taking into account Ax's good suggestion that you reformat your dates to be properly sortable:
    Code:
    #!/usr/bin/perl
    
    use strict;
    use warnings;
    
    my $arr = [
      [ '02-05-2005', 'elem1', 'elem2' ],
      [ '02-02-2005', 'elem3', 'elem4' ],
    ];
    
    foreach my $i ( @$arr ) {
        $i->[0] =~ s/^(..)-(..)-(....)$/$3$2$1/;
    }
    
    $arr = [ sort { $a->[0] <=> $b->[0] } @$arr ];
    
    # to view your newly sorted data structure
    use Data::Dumper;
    print Dumper( $arr );
    - dsb -
    Perl Guy
  16. #9
  17. Banned ;)
    Devshed Supreme Being (6500+ posts)

    Join Date
    Nov 2001
    Location
    Woodland Hills, Los Angeles County, California, USA
    Posts
    9,607
    Rep Power
    4247
    Looks like a good candidate for a Schwartzian Transform
    Code:
    #!/usr/bin/perl -w
    use Data::Dumper;
    $array = [
              ['02-05-2005', 'elem2', 'elem3'],
              ['02-02-2005', 'elem3', 'elem5'],
    ];
    
    # Eat my Schwartz
    my @foo =
        map  { $_->[1] }
        sort { $a->[0] cmp $b->[0] }
        map  { ($dd, $mm, $yyyy) = split '-', $_->[0]; ["$yyyy-$mm-$dd", $_]; } @$array;
    
    print Dumper(@foo);

    Comments on this post

    • keath agrees : # Eat my Schwartz :)
    • dsb agrees : sweet
    • raklet agrees : Nice.
    Up the Irons
    What Would Jimi Do? Smash amps. Burn guitar. Take the groupies home.
    "Death Before Dishonour, my Friends!!" - Bruce D ickinson, Iron Maiden Aug 20, 2005 @ OzzFest
    Down with Sharon Osbourne

    "I wouldn't hire a butcher to fix my car. I also wouldn't hire a marketing firm to build my website." - Nilpo
  18. #10
  19. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jul 2004
    Posts
    188
    Rep Power
    11
    Now Scorpion's answer looks fantastic but whoooosh, something just flew over my head. I think I'm going to have to do some serious reading up to get my head round that; it's 4.55 here though, so I doubt I'll get it done today.

    Code:
    my @foo = map {$_->[1]} sort {$a->[0] cmp $b->[0]}
    map { ($dd, $mm, $yyyy) = split '-', $_->[0]; ["$yyyy-$mm-$dd", $_]; } @$data;
    @sortedData = @foo;
    As I said, I haven't had time to go through the code to understand it, but should the above actually work, given the array @data as input and @sortedData as output?

    It'd be nice to have it working before I go, then I can concentrate on understanding these maps tomorrow.

    Thanks for taking the time to reply!

    Lewis.
  20. #11
  21. No Profile Picture
    PerlGuy
    Devshed Novice (500 - 999 posts)

    Join Date
    Jan 2001
    Posts
    720
    Rep Power
    42

    Post


    Assuming you declare $mm, $dd, and $yyyy with my(), it should work.

    It is very neat looking and is probably less complicated than you think it is:

    Code:
    my @foo = 
        map {$_->[1]}
        sort {$a->[0] cmp $b->[0]} 
        map { ($dd, $mm, $yyyy) = split '-', $_->[0]; ["$yyyy-$mm-$dd", $_]; } @$array;
    Take all the code out of the code blocks and this basically works like this:
    Code:
    map BLOCK LIST  # LIST is sort BLOCK LIST
    sort BLOCK LIST # LIST is map BLOCK LIST
    map BLOCK LIST  # LIST is @$array
    The 2nd map returns anonymous arrays with the reformatted date as the first element, followed by the original arrayref. The sort returns the appropriate list item received from the 2nd map. The first map then simply takes the item received from the sort and returns the original arrayref associated with the reformatted date to @foo.
    NOTE: The reformatted date is LEFT OUT of the list that is returned to @foo.

    Another Note: Scorpion's transform benchmarked favorably to my method. So not only is it cooler looking and more impressive, it's faster too
    - dsb -
    Perl Guy
  22. #12
  23. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jul 2004
    Posts
    188
    Rep Power
    11
    When i try it, i just get nothing returned, perhaps if i got a silver spoon with the code on it i could actually get something to work lol!

    Code:
    my ($dd, $mm, $yyyy) = 0;
                    my @foo =
                    map {$_->[1]}
                    sort {$a->[0] cmp $b->[0]}
                    map { ($dd, $mm, $yyyy) = split '-', $_->[0]; ["$yyyy-$mm-$dd", $_]; } @$data;
                    @sortedData = @foo;
    How far off am I?

    Realistically I probably shouldn't do it this way if I don't understand it, and I should probably look up what a map actually is rather than changing things hoping to just bump into the answer... but in the meantime, where might I be going wrong?

    Thanks again.
  24. #13
  25. Banned ;)
    Devshed Supreme Being (6500+ posts)

    Join Date
    Nov 2001
    Location
    Woodland Hills, Los Angeles County, California, USA
    Posts
    9,607
    Rep Power
    4247
    Well, I'm driving to work in a couple of mins, but maybe you'll figure it out by the time I get there. If not, I'll post a longer explanation later. You should read the link explaining the Schwartzian Transform above. It is basically a DSU (Decorate-Sort-Undecorate) pattern.
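    In the meantime, one possible culprit, offered as a guess since the rest of the script isn't shown: post #10 describes the input as the array @data, but the code reads @$data, which dereferences a scalar $data. Those are two different variables. A sketch under that assumption, with sample rows made up for illustration:

```perl
use strict;
use warnings;

# Assumption: the input is a plain array @data, as described in post #10.
my @data = (
    ['02-05-2005', 'elem2', 'elem3'],
    ['02-02-2005', 'elem3', 'elem5'],
);

# @$data would dereference a scalar $data, which is not the same variable
# as @data, so map would most likely see an empty list. Pass @data itself.
my @sortedData =
    map  { $_->[1] }
    sort { $a->[0] cmp $b->[0] }
    map  { my ($dd, $mm, $yyyy) = split /-/, $_->[0]; ["$yyyy-$mm-$dd", $_] }
    @data;

print "$_->[0]\n" for @sortedData;
```

    If the data really does live behind a reference in $data, then @$data is right and the problem is somewhere else.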
  26. #14
  27. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jul 2004
    Posts
    188
    Rep Power
    11
    As I'm now going home I won't be able to look into it any more. Well, I might make time to look over that page you linked to, but if not I'll definitely take a look over it tomorrow morning, as it's certainly worth knowing.

    I look forward to your more detailed reply, thanks very much and my apologies for being so damn stupid!
  28. #15
  29. Banned ;)
    Devshed Supreme Being (6500+ posts)

    Join Date
    Nov 2001
    Location
    Woodland Hills, Los Angeles County, California, USA
    Posts
    9,607
    Rep Power
    4247
    Ok, I'll try and explain this the best I can. Basically, the sort function in perl has two forms:
    @result = sort @arraylist
    @result = sort custom_function @arraylist
    The first form is used when you have an array to be sorted in alphabetical order.
    Code:
    @data = ("foo", "bar", "quux");
    @result = sort @data;
    The second form is used when you need to sort the array in any other order (for instance, numeric order, reverse order, sort by first 3 characters alone etc.). Basically, you need to supply a custom function for sort to call. For instance, if you want to sort an array of numbers:
    Code:
    @data = (100, 20, 15);
    @result = sort @data; # Wrong. Data is sorted alphabetically
    # @result will be (100, 15, 20) because sort is alphabetical when
    # we don't supply a custom function 
    #
    # Correct way to sort, using a custom function.
    sub sort_num {
        if ($a < $b) {
            return -1;
        }
        elsif ($a > $b) {
            return +1;
        }
        else {
            return 0;
        }
    }
    
    @result = sort sort_num @data;
    # Now we have a correct answer
    When we call sort with a function (sort_num) as the argument, the sort function calls our sort_num function every time it has to compare two elements. It is up to our special function (sort_num) to return a value (-1, 0 or +1) depending on whether the first argument is less than, equal to or greater than the second one, and sort will rearrange the elements according to the return value of our sort_num function.

    With that said, since comparing numbers is such a common operation, perl has a special operator to do this, so we can code sort_num as:
    Code:
    sub sort_num {
        return $a <=> $b;
    }
    which does the same thing. To compare strings, the equivalent operator is cmp instead of <=> operator. Instead of writing a separate sort_num function, you can combine it directly into the sort statement, like this:
    Code:
    @result = sort { $a <=> $b} @data;
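    A few more inline comparison blocks, as a small sketch of the other orderings mentioned earlier (reverse order, string order, and so on). The variable names are only for illustration.

```perl
use strict;
use warnings;

my @numbers = (100, 20, 15);
my @words   = ('foo', 'bar', 'quux');

my @ascending  = sort { $a <=> $b } @numbers;   # numeric, smallest first
my @descending = sort { $b <=> $a } @numbers;   # swap $a and $b to reverse
my @by_string  = sort { $a cmp $b } @words;     # string comparison

# Chain comparisons with "or": sort by length, break ties alphabetically.
my @by_length  = sort { length($a) <=> length($b) or $a cmp $b } @words;

print "@ascending\n@descending\n@by_string\n@by_length\n";
```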
    Now that we have an idea how sort works, let's try to work it with our data. In our case, we have to compare the dates. The easiest way to do this is to change the dates to YYYYMMDD format and then do a numeric compare (note that in the example above, I'd done a string compare with cmp, which might be a bad thing after Year 10000).
    Code:
    $array = [
              ['02-05-2005', 'elem2', 'elem3'],
              ['02-02-2005', 'elem3', 'elem5'],
    ];
    
    sub sort_func {
        # Split first element of each argument into day/month/year parts
        my ($d1, $m1, $y1) = split '-', $a->[0]; 
        my ($d2, $m2, $y2) = split '-', $b->[0];
    
        # Now convert the dates into YYYYMMDD format
        my $date1 = $y1 . $m1 . $d1;
        my $date2 = $y2 . $m2 . $d2;
        
        # and compare them
        return $date1 <=> $date2;
    }
    
    @result = sort sort_func @$array;
    Of course, you can remove the code from sort_func and add it to the sort call in a block like this:
    Code:
    @result = sort {
        my ($d1, $m1, $y1) = split '-', $a->[0]; 
        my ($d2, $m2, $y2) = split '-', $b->[0];
    
        # Now convert the dates into YYYYMMDD format
        my $date1 = $y1 . $m1 . $d1;
        my $date2 = $y2 . $m2 . $d2;
        
        # and compare them
        return $date1 <=> $date2;
                  } @$array;
    which does the same thing. This should sort your data just fine the way you want. However, there is an inefficiency associated with this. Consider that you have 5 elements to sort. The sort calls sort_func to compare element 1 and element 2. You transform both to YYYYMMDD format, compare the values and return the result. Next, it calls sort_func to compare elements 1 and 3. Again, you transform them to YYYYMMDD format, compare the values and return the result. Then you do the same when it wants to compare elements 1 and 4. Notice that you are transforming element 1 multiple times in the above three comparisons, even though the result of the transformation hasn't changed. It should be possible to somehow perform the transformations only once and remember the result for the future, so it doesn't have to be done over and over again.
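    To put rough numbers on that, here is a small sketch that counts the conversions. The helper to_key, the counter and the five sample dates are all made up for illustration, and the "convert once" half uses a plain hash of precomputed keys purely to show the difference in call counts.

```perl
use strict;
use warnings;

my @rows = map { [$_] } qw(14-03-2005 02-05-2005 02-02-2005 30-11-2004 09-01-2006);

my $calls = 0;

# Hypothetical helper: dd-mm-yyyy -> yyyymmdd, counting every invocation.
sub to_key {
    my ($d, $m, $y) = split /-/, shift;
    $calls++;
    return "$y$m$d";
}

# Naive approach: convert inside the comparator, twice per comparison.
my @naive = sort { to_key($a->[0]) <=> to_key($b->[0]) } @rows;
my $naive_calls = $calls;

# Convert each element exactly once up front, then compare the stored keys.
$calls = 0;
my %key_for;
$key_for{ $_->[0] } = to_key($_->[0]) for @rows;
my @cached = sort { $key_for{ $a->[0] } <=> $key_for{ $b->[0] } } @rows;
my $cached_calls = $calls;

print "naive: $naive_calls conversions, cached: $cached_calls conversions\n";
```

    The exact naive count depends on how many comparisons sort happens to make, but it is always two conversions per comparison, whereas the precomputed version needs one per element.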

    Here's where the DSU pattern comes in. It stands for Decorate, Sort and Undecorate. The trick is to go in once and "Decorate" the array (i.e.) transform it to a format that is easy to sort, then "Sort" the array and then "Undecorate" (i.e.) remove our transformation from the result.

    So, first we tackle the "Decorate" part. We are going to use the Data::Dumper module so you can see what is going on, every step of the way.
    Code:
    use Data::Dumper;
    $array = [
              ['02-05-2005', 'elem2', 'elem3'],
              ['02-02-2005', 'elem3', 'elem5'],
    ];
    
    @decorate = map { ($dd, $mm, $yyyy) = split '-', $_->[0]; ["$yyyy$mm$dd", $_]; } @$array;
    print Dumper(@decorate);
    Basically, the map keyword works like this:
    @result = map { BLOCK } @array;
    What it does is go through each element of @array, run the block with $_ set to that element, and return the block's results as a new list. So, in the above expression, if you print out @decorate, you will see that each element is now a two-element array: the first element is the date rearranged into YYYYMMDD format, and the second is a reference to the original row, untouched.
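    If map is new to you, a tiny standalone sketch may help before reading on. The word list is just sample data.

```perl
use strict;
use warnings;

my @words = ('foo', 'bar', 'quux');

# map runs the block once per element, with $_ set to that element,
# and collects whatever the block returns into a new list.
my @lengths = map { length $_ } @words;

# Returning an array reference per element builds a list of pairs,
# which is the same shape the "Decorate" step produces.
my @pairs = map { [ length($_), $_ ] } @words;

print "@lengths\n";
print "$_->[0] $_->[1]\n" for @pairs;
```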

    Now for the "Sort" part.
    Code:
    @sorted = sort {$a->[0] <=> $b->[0]} @decorate;
    print Dumper(@sorted);
    This is pretty simple stuff here. Our custom sort block merely compares the first element (i.e. element 0) of each decorated row, which is the date in YYYYMMDD format, and returns a @sorted array.

    Now for the "Undecorate" part. We need to remove the first column from each element. We can use a map to do this:
    Code:
    @undecorate = map {$_->[1]} @sorted;
    print Dumper(@undecorate);
    All we're doing with map here is returning the second element of each pair (the reference to the original row) back to @undecorate, which discards the first element (our transformed YYYYMMDD key).

    Now, we can combine these three together into one statement.
    Code:
    @result = map {$_->[1]}                    # Undecorate
                  sort {$a->[0] <=> $b->[0]}  # Sort
                  map { ($dd, $mm, $yyyy) = split '-', $_->[0]; ["$yyyy$mm$dd", $_]; } @$array; # Decorate
    Working this statement backwards, you see we're "Decorating", passing the result to "Sort" and then passing that result to "Undecorate". There you have it, a classic Schwartzian (or DSU) transform.
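    For anyone running under use strict, here is a sketch of the same statement with every variable declared. A third row has been added as sample data so the reordering is easier to see.

```perl
use strict;
use warnings;

my $array = [
    ['02-05-2005', 'elem2', 'elem3'],
    ['02-02-2005', 'elem3', 'elem5'],
    ['15-12-2004', 'elem6', 'elem7'],
];

my @result =
    map  { $_->[1] }                          # Undecorate: keep the original row
    sort { $a->[0] <=> $b->[0] }              # Sort on the yyyymmdd key
    map  {                                    # Decorate: [key, original row]
        my ($dd, $mm, $yyyy) = split /-/, $_->[0];
        ["$yyyy$mm$dd", $_];
    } @$array;

print join(' ', @$_), "\n" for @result;
```

    Declaring $dd, $mm and $yyyy with my inside the block keeps them private to each pass through map.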

    Hope this all makes sense.

    Comments on this post

    • doof205 agrees : Fantastic!
    • ishnid agrees
    • Axweildr agrees : A shining lighty among 'answers', the epitome of all answers ... maith an buachaill thu