Page 1 of 3 123 Last
  • Jump to page:
    #1
  1. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Nov 2012
    Location
    Texas
    Posts
    24
    Rep Power
    0

    String to Int conversion speed


    Using Python 2.6, I'm trying to solve a math problem, I need to convert a string with 10^5 (100,000) numeric characters to an integer.

    Using n=int(s) takes it 1 minute and 45 seconds.

    I can live with that, but I need to do the same for 10^6, 10^7, etc. until I find the answer I'm looking for, and the time it takes for the int(s) conversion is getting very long.

    Is there a faster string-to-int conversion that anyone knows off?

    Thanks,

    Ole
  2. #2
  3. Contributing User
    Devshed Demi-God (4500 - 4999 posts)

    Join Date
    Aug 2011
    Posts
    4,894
    Rep Power
    481
    Your problem looks similar to another recent post which had base 10 integers of 60 thousand digits. For that particular problem there was no reason to convert the string to an integer. Please describe, at a higher level, the problem you're trying to solve so we can consider other approaches.
    [code]Code tags[/code] are essential for python code and Makefiles!
  4. #3
  5. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Nov 2012
    Location
    Texas
    Posts
    24
    Rep Power
    0
    Originally Posted by b49P23TIvg
    Your problem looks similar to another recent post which had base 10 integers of 60 thousand digits. For that particular problem there was no reason to convert the string to an integer. Please describe, at a higher level, the problem you're trying to solve so we can consider other approaches.
    For this problem there is a reason to convert it to an integer, as I need to count how many prime factors below 100,000 it contains.

    As an example, an integer consisting of 10^1 (10) 1's, have the following prime factors: (11, 41, 271 & 9091).

    An integer consisting of 10^2 (100) 1's, have the following prime factors (not checking the previous found): (101, 251, 3541, 5051, 21401, 25601, 27961 & 60101).

    At one point (10^n 1's) there will be no more prime factors below 100K. I'm trying to find the ones left over.
  6. #4
  7. Contributing User
    Devshed Demi-God (4500 - 4999 posts)

    Join Date
    Aug 2011
    Posts
    4,894
    Rep Power
    481
    zowie!

    j (executable Iverson notation) converts a random string of 1e5 digits to an integer in a flash---well under a second. I asked for the prime factors which sent my computer into a page file hell which j was clever enough to abort. I expect that had I merely asked j to check for a 0 remainder on divisions by the primes less than 100000 there'd have been no trouble. OK, there's a problem with old python (or with your code).

    Upgrade to a new python 2.7 or to python3.
    The conversion takes place in a blink.
    Code:
    $ python3
    Python 3.2.3 (default, Oct 19 2012, 19:53:16) 
    [GCC 4.7.2] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import random,string
    >>> s=''.join(string.digits[random.randrange(10)]for i in range(100000))
    >>> i=int(s)
    >>>
    Alternatively, you could try the problem with the gnu multi-precision library, gmp.

    Oh! You could investigate theorems about repdigits.
    Last edited by b49P23TIvg; November 9th, 2012 at 10:42 AM.
    [code]Code tags[/code] are essential for python code and Makefiles!
  8. #5
  9. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Nov 2012
    Location
    Texas
    Posts
    24
    Rep Power
    0
    With all that being said, how long did the i=int(s) take on your computer with Python 3.2.3?

    As far as I recall, I did some speed tests between different Python versions, and I compared 2.6, 2.7 and 3.something. The 2.6 was the fastest (not sure why), which is why I'm still using it.

    Thanks.
  10. #6
  11. Contributing User
    Devshed Demi-God (4500 - 4999 posts)

    Join Date
    Aug 2011
    Posts
    4,894
    Rep Power
    481
    Takes 13 seconds for 1 conversion of a million digits.

    p.py:

    for i in range(1):
    a = int('1'*(int(1e6)))


    $ time python3 p.py

    real 0m12.385s
    user 0m12.345s
    sys 0m0.008s



    j converts a character array of one million 1's to an extended precision integer in 0.03 seconds. Trust me about the notation or, better yet, learn j.
    Code:
       ts'".''x'',~1e6#''1'''  NB. ts reports time and space
    0.026998 3.04111e7
    Here's a solution in j. j is now open source but wasn't. I think we haven't yet replaced custom arbitrary precision integer code with gmp. gmp is faster. (libgmp is also a huge library. That, and perhaps license problems have prevented python from incorporating it.) The run time for all the divisions gets fairly long. And the space used to solve the problem "all at once" is too big. I've got 4Gb RAM on an INTEL new-nomenclature-2 processor running between 2 and 3 GHz Lenovo T500 laptop computer.
    Code:
       p=: i.&.(p:^:_1)1e5  NB. primes less than 1e5
    
       (4&{.,_4&{.)p        NB. first 4 and last 4, for your happy verification
    2 3 5 7 99961 99971 99989 99991
    
       ]repunit=:".'x',~1e3#'1'   NB. repunit has 1000 ones.
    1111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111...
    
       mask=: 1(0=|&repunit)\p
    
       (,mask)#p
    11 41 73 101 137 251 271 401 751 1201 1601 3541 4001 5051 9091 21001 21401 24001 25601 27961 60101 76001
    Last edited by b49P23TIvg; November 9th, 2012 at 11:46 AM.
    [code]Code tags[/code] are essential for python code and Makefiles!
  12. #7
  13. Contributing User
    Devshed Novice (500 - 999 posts)

    Join Date
    Feb 2005
    Posts
    611
    Rep Power
    65
    I simply used module timeit.
    Function int(s) of the 100000 digit string used by b49P23TIvg takes about
    1.2 seconds with Python273 and Python323
    0.4 seconds with Python330
    Last edited by Dietrich; November 9th, 2012 at 12:41 PM.
    Real Programmers always confuse Christmas and Halloween because Oct31 == Dec25
  14. #8
  15. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Nov 2012
    Location
    Texas
    Posts
    24
    Rep Power
    0
    Thanks guys.

    Before I dive into J, which sounds very interesting, I'd like to "crack the nut" with Python.

    I do not understand how you can do it that fast in Python 2.7 and above.

    Could I ask you a favor and try this code for me?

    Code:
    from datetime import datetime
    
    def timenow():
      ts=str(datetime.now())
      return ts[11:22]
    
    print "Before creating string   : ",timenow()
    
    s1=""
    s2=s1.ljust(10**6,"1")
    
    print "Before converting to int : ",timenow()
    
    n1=int(s2)
    
    print "Int converted            : ",timenow()
    The output from this is:

    Code:
    c:\Python26>python p133st.py
    Before creating string     :  12:25:26.74
    Before converting to int   :  12:25:26.74
    Int converted              :  12:27:13.68
    
    c:\Python26>
    As you can see, it takes about 1:45 minutes to convert a 1,000.000 digit string to an integer on my 2.6.

    This is a pretty new i7 I have.

    Thanks,

    Ole
  16. #9
  17. Contributing User
    Devshed Demi-God (4500 - 4999 posts)

    Join Date
    Aug 2011
    Posts
    4,894
    Rep Power
    481
    Output: (11 seconds)
    Code:
    $ python p.py
    Before creating string   :  13:44:22.05
    Before converting to int :  13:44:22.05
    Int converted            :  13:44:32.75
    $
    Slight change to the program:
    Code:
    import datetime
    
    now = datetime.datetime(2012,11,9).now#year, month, day[, hour[, minute[, second[, microsecond[,tzinfo]]]]])
    
    def timenow():
      ts=str(now())
      return ts[11:22]
    
    print "Before creating string   : ",timenow()
    
    s1=""
    s2=s1.ljust(10**6,"1")
    
    print "Before converting to int : ",timenow()
    
    n1=int(s2)
    
    print "Int converted            : ",timenow()
    Last edited by b49P23TIvg; November 9th, 2012 at 12:50 PM.
    [code]Code tags[/code] are essential for python code and Makefiles!
  18. #10
  19. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Nov 2012
    Location
    Texas
    Posts
    24
    Rep Power
    0
    Wow, that's quite an improvement in speed - roughly 10 times faster.

    What version of Python did you run this on?

    Thanks again for your help.
  20. #11
  21. Contributing User
    Devshed Demi-God (4500 - 4999 posts)

    Join Date
    Aug 2011
    Posts
    4,894
    Rep Power
    481
    $ python
    Python 2.7.3 (default, Sep 26 2012, 21:51:14)
    [GCC 4.7.2] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>>
    [code]Code tags[/code] are essential for python code and Makefiles!
  22. #12
  23. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Nov 2012
    Location
    Texas
    Posts
    24
    Rep Power
    0
    Thanks - I'll try with 2.7.3.

    Sorry I can't rep you right now, but I am so new that I don't have those rights yet.

    I'll come back and rep you in a month.
  24. #13
  25. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Nov 2012
    Location
    Texas
    Posts
    24
    Rep Power
    0
    Hmm, it looks pretty much the same on my computer for both versions. Actually 1 second faster on 2.6...

    Code:
    
    c:\Python27>python p133st.py
    Before creating string   :  13:21:07.63
    Before converting to int :  13:21:07.63
    Int converted            :  13:22:54.37
    
    c:\Python27>cd\python26
    
    c:\Python26>python p133st.py
    Before creating string   :  13:23:12.38
    Before converting to int :  13:23:12.38
    Int converted            :  13:24:58.05
    
    c:\Python26>
    
    I do have other things running on it, but it's an i7 with 8 cores, so ...
  26. #14
  27. Contributing User
    Devshed Novice (500 - 999 posts)

    Join Date
    Feb 2005
    Posts
    611
    Rep Power
    65
    Done on a CORE i5 machine:
    Code:
    '''mpm_timing2.py
    time a function with mpmath.timing()
    time int(s) where s is a string of 1 million digits
    
    downloaded Windows installer file (for Python27 to Python33)
    mpmath-0.17.win32.exe
    from
    http://code.google.com/p/mpmath/
    
    mpmath.timing(f, *args, **kwargs)
    Returns time elapsed for evaluating f() in seconds
    arguments can be passed to time the execution of f(*args, **kwargs)
    
    compares well with the Python module timeit
    '''
    
    import mpmath as mpm
    import sys
    
    print("Python version:\n %s\n" % sys.version)
    
    # create a numeric test string of one million sevens
    s = '7'*1000000
    
    # now time Python function int(s)
    print("int(s) = %f seconds" % mpm.timing(int, s))
    
    '''my result >>>
    Python version:
     2.7.3 (default, Apr 10 2012, 23:31:26) [MSC v.1500 32 bit (Intel)]
    int(s) = 117.061921 seconds
    
    Python version:
     3.2.3 (default, Apr 11 2012, 07:15:24) [MSC v.1500 32 bit (Intel)]
    int(s) = 113.484790 seconds
    
    Python version:
     3.3.0 (v3.3.0:bd8afb90ebf2, Sep 29 2012, 10:55:48) [MSC v.1600 32 bit (Intel)]
    int(s) = 31.143113 seconds
    '''
    You can see that Python as an interpreted language just comes out slow for your purposes. You might want to check out compiled languages like C or C++ for executable speed.
    Real Programmers always confuse Christmas and Halloween because Oct31 == Dec25
  28. #15
  29. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Nov 2012
    Location
    Texas
    Posts
    24
    Rep Power
    0
    Thanks for your testing.

    FYI, I just added 3.3.0, and corrected the print syntax, and it now looks MUCH better.

    Code:
    c:\Python33>python p133st.py
    Before creating string   :  13:54:22.43
    Before converting to int :  13:54:22.43
    Int converted            :  13:54:32.21
    EDIT: Oh, and C/C++ does not hold integers that large.
Page 1 of 3 123 Last
  • Jump to page:

IMN logo majestic logo threadwatch logo seochat tools logo