#1
  1. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Oct 2013
    Posts
    2
    Rep Power
    0

    Help! Stuck on methods to count file size.


    My program needs to read an input file with 3 columns (month dates, fileSize, fileName) and process it in 3 steps. Here I sample only 2 months of dates, in Julian year order, as an example.

    1. Count the total number of files loaded in each month.
    2. Sum the total file size for each month.
    3. Find the largest file size in each month.

    I need help correcting my code: when I run it, the output is not broken down by individual month but covers all months together. I also need the output for each month sorted in descending order by file size. The date and file name should stay matched with their file size, so I can find the largest file along with its date and file name. I am not sure what code can handle this. I would much appreciate it if anyone could improve my code to get closer to a solution.

    Here is my input file format.
    Second column is the file size.
    Code:
    001 175 FILENAME
    002 1856 FILENAME
    003 177 FILENAME
    032 175 FILENAME
    033 2345 FILENAME
    034 175 FILENAME
    Here is my code:

    Code:
    #!/usr/bin/perl
    use strict;
    use warning;
    use Data::Dumper;
    use File::Find;
    use File::stat;
    use sort 'stable';
    
    my $filin = '/root/scripts/newsort.in';
    my $fleot = '/root/scripts/results/size.out';
    
    open my $fh, $filin || die $!;
    open my $fot, ">$fleot" || die $!;
    
    
    ##Define month lengths
    @Janlen = ( '006', '007', '008', '009', '010', '011', '012', '013',
                '014', '015', '016', '017', '018', '019', '020', '021',
                '022', '023', '024', '025', '026', '027', '028', '029',
                '030', '031' );
    @Feblen = ( '032', '033', '034', '035', '036', '037', '038', '039',
                '040', '041', '042', '043', '044', '045', '046', '047',
                '048', '049', '050', '051', '052', '053', '054', '055',
                '056', '057', '058', '059' );
    #Define month hash
    %mthlens = (@Janlen, @Feblen);
    my @julens = %mthlens;
    my $julias = @julens;
    my $Janlias = @Janlen;
    my $Feblias = @Feblen;
    my $Marlias = @Marlen;
    
    while (%mthlens=<$fh>){
    chomp;
    
    my %lengths = map { $_ => length $_ } %mthlens;
    while ( my ($Janlen,$length,$filename) = each %lengths) {
         @s = sort { $length{$b} <=> $length{$a}} keys %length;
         print join("\t", $Janlen,  $length, $filename), "\n";
       } 
    }
    Here is my output file format.
    File sizes are displayed in the second column. I am not sure what the numbers following FILENAME are, such as 38, 33, 38, ...

    Code:
    024 710 FILENAME 
            38
    114 923 FILENAME
            33
    044 367 FILENAME
            38
    083 7864 FILENAME
            39
    153 783 FILENAME
            33
    084 864 FILENAME
    Any input is much appreciated!

    Thank you!
  2. #2
  3. !~ /m$/
    Devshed Specialist (4000 - 4499 posts)

    Join Date
    May 2004
    Location
    Reno, NV
    Posts
    4,251
    Rep Power
    1810
    My program needs to read an input file with 3 columns (month dates, fileSize, fileName) and process it in 3 steps. Here I sample only 2 months of dates, in Julian year order, as an example.
    Is it going to be a file? Because you are importing File::Find and File::stat, but not using them in your example.

    I only mention that because of all the work you are doing trying to determine the month, and that is avoidable if you can get the timestamps directly from the files.
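A minimal sketch of reading the month straight from a file, assuming the files themselves are available on disk. The helper name `file_month_and_size` and the temporary demo file are made up for illustration; only `File::stat`, `POSIX::strftime` and `File::Temp` are real modules.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::stat;                 # object interface: stat($path)->mtime, ->size
use File::Temp qw(tempfile);
use POSIX qw(strftime);

# Hypothetical helper: return ("YYYY-MM", size in bytes) for one file,
# taking the month from the file's last-modification time (local time).
sub file_month_and_size {
    my ($path) = @_;
    my $st = stat($path) or die "Cannot stat $path: $!";
    my $month = strftime('%Y-%m', localtime($st->mtime));
    return ($month, $st->size);
}

# Demo on a throwaway file so the sketch runs anywhere.
my ($tmp_fh, $tmp_path) = tempfile(UNLINK => 1);
print {$tmp_fh} 'x' x 175;      # 175 bytes, like the first sample row
close $tmp_fh or die "Cannot close $tmp_path: $!";

# Set the modification time to noon UTC on 22 Oct 2013 for the demo.
my $when = 1382443200;
utime $when, $when, $tmp_path or die "Cannot set mtime on $tmp_path: $!";

my ($demo_month, $demo_size) = file_month_and_size($tmp_path);
print "$demo_month\t$demo_size\n";
```

With real data, the same helper could be called from a `File::Find` callback for each file, which would make the hand-written month tables unnecessary.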
  4. #3
  5. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Oct 2013
    Posts
    2
    Rep Power
    0
    Originally Posted by keath
    Is it going to be a file? Because you are importing File::Find and File::stat, but not using them in your example.

    I only mention that because of all the work you are doing trying to determine the month, and that is avoidable if you can get the timestamps directly from the files.
    Thank you for pointing that out. The task requires using the month, not the calendar-year timestamp. I tried sorting on the length, but it doesn't work. I am trying to sort the length column in descending order so the biggest file is at the top. For now I just need to see the sort function working, but it doesn't.
  6. #4
  7. !~ /m$/
    Devshed Specialist (4000 - 4499 posts)

    Join Date
    May 2004
    Location
    Reno, NV
    Posts
    4,251
    Rep Power
    1810
    What I mean about the timestamp is that it is just an integer, but you can get the month, day, year and time (hours, minutes, seconds) all from that one integer.
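To illustrate, here is a small sketch that splits one epoch timestamp into its parts with Perl's built-in `gmtime` (`localtime` takes the same argument and returns the same list, adjusted to the local time zone). The function name `timestamp_parts` and the sample value are only for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Split an epoch timestamp (seconds since 1970-01-01 00:00:00 UTC)
# into named parts. gmtime is used so the result does not depend on
# the time zone of the machine running the sketch.
sub timestamp_parts {
    my ($epoch) = @_;
    my ($sec, $min, $hour, $mday, $mon, $year, $wday, $yday) = gmtime($epoch);
    return {
        year   => $year + 1900,  # gmtime counts years from 1900
        month  => $mon + 1,      # gmtime months run 0..11
        day    => $mday,
        hour   => $hour,
        minute => $min,
        second => $sec,
        yday   => $yday + 1,     # day of the year, made 1-based here
    };
}

my $parts = timestamp_parts(1382400000);   # sample value, 16000 days after the epoch
printf "%04d-%02d-%02d %02d:%02d:%02d UTC, day %d of the year\n",
    @{$parts}{qw(year month day hour minute second yday)};
```

For a file on disk, the integer would come from `(stat $path)[9]`, the last-modification time.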

    But instead you provided something that looks like a day:

    Here I sample only 2 months of dates, in Julian year order, as an example.
    Code:
    001 175 FILENAME
    002 1856 FILENAME
    003 177 FILENAME
    032 175 FILENAME
    033 2345 FILENAME
    034 175 FILENAME
    But I don't recognize those three digit numbers in the first column as being months. I can explain how to sort on any column if you want, but I don't recognize your date format.
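A minimal sketch of sorting on a column, using the sample rows quoted above. The grouping step at the end rests on an assumption the thread does not confirm: that the first column is a day of the year (001 to 365) in a non-leap year. The names `sort_desc_by_column`, `month_of_day` and `%summary` are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample rows from the thread: first column, size, name.
my @lines = (
    '001 175 FILENAME',  '002 1856 FILENAME', '003 177 FILENAME',
    '032 175 FILENAME',  '033 2345 FILENAME', '034 175 FILENAME',
);

# Keep each line as [col0, size, name] so the fields travel together.
my @rows = map { [ split ' ', $_, 3 ] } @lines;

# Sort records numerically on a zero-based column, largest first.
sub sort_desc_by_column {
    my ($col, @recs) = @_;
    return sort { $b->[$col] <=> $a->[$col] } @recs;
}

my @by_size = sort_desc_by_column(1, @rows);
print join("\t", @$_), "\n" for @by_size;

# ASSUMPTION: column 0 is a day of the year (1..365), non-leap year.
my @month_days = (31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31);

sub month_of_day {
    my ($day) = @_;
    my $m = 0;
    while ($m < 11 && $day > $month_days[$m]) {
        $day -= $month_days[$m];
        $m++;
    }
    return $m + 1;               # 1 = January ... 12 = December
}

# Group the rows by month, then summarise each group.
my %by_month;
push @{ $by_month{ month_of_day($_->[0]) } }, $_ for @rows;

my %summary;
for my $month (sort { $a <=> $b } keys %by_month) {
    my @sorted = sort_desc_by_column(1, @{ $by_month{$month} });
    my $total = 0;
    $total += $_->[1] for @sorted;
    $summary{$month} = {
        count   => scalar @sorted,
        total   => $total,
        largest => $sorted[0],   # whole record: day, size, name
    };
    printf "month %02d: %d files, total size %d, largest %d (day %s, %s)\n",
        $month, scalar @sorted, $total, @{ $sorted[0] }[1, 0, 2];
}
```

Sorting array references rather than bare sizes is what keeps the date and file name attached to each size. Changing the first argument of `sort_desc_by_column` selects a different column.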
