#1
  1. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jul 2013
    Location
    America (But which one?)
    Posts
    43
    Rep Power
    1

    Phrase Dictionary utilizing another Dictionary


    I have been looking for a good Python script to create a dictionary of any size I want, and I found a really great one here on Devshed (with the code below) provided by Lucantrop.

    I was looking to tweak it a bit, but don't even know where to start!

    I would like to make this script read a separate dictionary text file, and use the provided phrases in this dictionary, at random, as a basis for creating new phrases.

    For example, I have a text file with only three phrases in it: 'Doctor', 'Zeigler', and '08012013'.

    If my script tells the dictionary to utilize all letters, numbers, and the symbols above the numbers (just like in Lucantrop's provided script), with phrases being between 10 and 12 characters long, it would start producing results such as this:

    Zeiglerh57
    45Doctorw6y
    he08012013lo
    tyZeigler87
    ..........

    If there is any way this could be done, it would be ridiculously amazing!!!

    Here is the original script as provided by Lucantrop in the original post:
    Code:
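    import random
    
    # Characters each password is drawn from.
    abc = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz1234567890!@#$%^&*()'
    
    # The with block closes the file even if an error occurs.
    with open('pwds_test.txt', 'w') as out:
        for i in range(10):
            # random.sample picks without replacement, so no character
            # repeats within a single password.
            pwd = ''.join(random.sample(abc, random.randint(4, 25)))
            out.write(pwd + '\n')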
    I look forward to any responses!
  2. #2
  3. Contributing User
    Devshed Demi-God (4500 - 4999 posts)

    Join Date
    Aug 2011
    Posts
    4,711
    Rep Power
    480
    Code:
    '''
        This problem fails to interest me.
    '''
    
    import os, sys, random
    
    JUNK = []
    append = JUNK.append
    for i in range(128 or 256):
        c = '{:c}'.format(i)
        if c.isprintable():
            append(c)
    
    #print(JUNK)  # 128 is the better choice in English.
    
    dumb = lambda a: random.choice(list(range(a+1)))
    
    def embed_in_junk(WORD, SHORT, LONG, JUNK = JUNK):
        if (SHORT < 0) or (LONG < SHORT):
            raise ValueError('cmon dude')
        if LONG < len(WORD):
            raise ValueError('too dang long')
        EXTRA = dumb(LONG - SHORT) + (SHORT - len(WORD))
        PARC = ''.join(random.choice(JUNK) for i in range(EXTRA))
        SPLIT = dumb(len(PARC))
        return '{}{}{}'.format(PARC[:SPLIT], WORD, PARC[SPLIT:])
    
    if __name__ == '__main__':
        MX = 10
        with open('{}/Downloads/books/unixdict.txt'.format(os.environ['HOME'])) as inf:
            MANY_WORDS = inf.readlines()
        for W in (random.choice(MANY_WORDS) for i in range(30)):
            WORD = W.strip()
            try:
                print(embed_in_junk(WORD, 7, MX))
            except ValueError:
                print(WORD, len(WORD), 'better exceed {}!'.format(MX))
    [code]Code tags[/code] are essential for python code and Makefiles!
  4. #3
  5. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jul 2013
    Location
    America (But which one?)
    Posts
    43
    Rep Power
    1
    I'm sorry this fails to interest you, but thank you very much for replying!

    If you care to offer any more help, I don't fully understand the script (I like to understand what I'm using).

    Code:
    JUNK = []
    append = JUNK.append
    for i in range(128 or 256):
        c = '{:c}'.format(i)
        if c.isprintable():
            append(c)
    
    #print(JUNK)  # 128 is the better choice in English.
    What? I don't understand the whole 128 and 256 thing . . .

    And where do I choose the alphabet that it is going to use? I really hope you can help me :-/
  6. #4
  7. Contributing User
    Devshed Demi-God (4500 - 4999 posts)

    Join Date
    Aug 2011
    Posts
    4,711
    Rep Power
    480
    But more importantly, does this program do what you need?


    The JUNK input to embed_in_junk is any sequence, that is, an object supporting the __len__ and __getitem__ methods, which is what random.choice needs. In other words, you can index it. A list, tuple, or string all work. I generated it as a list.
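
As a rough sketch of the point above, any indexable sequence can serve as the junk alphabet, including the plain string from the original script. The helper below is a simplified stand-in, not the embed_in_junk function from post #2; the name pad_phrase, its parameters, and the ALPHABET constant are assumptions made for illustration, and the phrases are the three from the first post (they could just as well be read from a text file).

```python
import random

# Assumed alphabet: the same string used in the original script.
ALPHABET = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz1234567890!@#$%^&*()'

def pad_phrase(phrase, shortest, longest, alphabet=ALPHABET):
    """Surround phrase with random characters; total length is in [shortest, longest]."""
    if shortest < 0 or longest < shortest:
        raise ValueError('need 0 <= shortest <= longest')
    if longest < len(phrase):
        raise ValueError('phrase is longer than the maximum length')
    # Never aim for a total shorter than the phrase itself.
    total = random.randint(max(shortest, len(phrase)), longest)
    filler = [random.choice(alphabet) for _ in range(total - len(phrase))]
    split = random.randint(0, len(filler))  # position where the phrase goes
    return ''.join(filler[:split]) + phrase + ''.join(filler[split:])

phrases = ['Doctor', 'Zeigler', '08012013']
for _ in range(5):
    print(pad_phrase(random.choice(phrases), 10, 12))
```

Passing a list or tuple as alphabet works the same way, since random.choice only needs to know the length and index into it.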
    Instead of haphazardly typing all the keys on my keyboard, importing ascii_uppercase et cetera from string, or copying the string you copied, I generated a list of JUNK. The printable ASCII characters below 128 are useful in my locale. You might be German and would like to use an Eszett (ß) or vowels with umlauts, which is what extending the range to 256 gives you. Printable characters below 256:
    Code:
    32 < >	33 <!>	34 <">	35 <#>	36 <$>	37 <%>	38 <&>	39 <'>	
    40 <(>	41 <)>	42 <*>	43 <+>	44 <,>	45 <->	46 <.>	47 </>	
    48 <0>	49 <1>	50 <2>	51 <3>	52 <4>	53 <5>	54 <6>	55 <7>	
    56 <8>	57 <9>	58 <:>	59 <;>	60 <<>	61 <=>	62 <>>	63 <?>	
    64 <@>	65 <A>	66 <B>	67 <C>	68 <D>	69 <E>	70 <F>	71 <G>	
    ...
    88 <X>	89 <Y>	90 <Z>	91 <[>	92 <\>	93 <]>	94 <^>	95 <_>	
    96 <`>	97 <a>	98 <b>	99 <c>	100 <d>	101 <e>	102 <f>	103 <g>	
    ...
    120 <x>	121 <y>	122 <z>	123 <{>	124 <|>	125 <}>	126 <~>	161 <¡>	
    162 <¢>	163 <£>	164 <¤>	165 <¥>	166 <¦>	167 <§>	168 <¨>	169 <©>	
    170 <ª>	171 <«>	172 <¬>	174 <®>	175 <¯>	176 <°>	177 <±>	178 <²>	
    179 <³>	180 <´>	181 <µ>	182 <¶>	183 <·>	184 <¸>	185 <¹>	186 <º>	
    187 <»>	188 <¼>	189 <½>	190 <¾>	191 <¿>	192 <À>	193 <Á>	194 <Â>	
    195 <Ã>	196 <Ä>	197 <Å>	198 <Æ>	199 <Ç>	200 <È>	201 <É>	202 <Ê>	
    203 <Ë>	204 <Ì>	205 <Í>	206 <Î>	207 <Ï>	208 <Ð>	209 <Ñ>	210 <Ò>	
    211 <Ó>	212 <Ô>	213 <Õ>	214 <Ö>	215 <×>	216 <Ø>	217 <Ù>	218 <Ú>	
    219 <Û>	220 <Ü>	221 <Ý>	222 <Þ>	223 <ß>	224 <à>	225 <á>	226 <â>	
    227 <ã>	228 <ä>	229 <å>	230 <æ>	231 <ç>	232 <è>	233 <é>	234 <ê>	
    235 <ë>	236 <ì>	237 <í>	238 <î>	239 <ï>	240 <ð>	241 <ñ>	242 <ò>	
    243 <ó>	244 <ô>	245 <õ>	246 <ö>	247 <÷>	248 <ø>	249 <ù>	250 <ú>	
    251 <û>	252 <ü>	253 <ý>	254 <þ>	255 <ÿ>
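
To see what the two limits change, the JUNK loop from post #2 can be wrapped in a small function and run with each value. This is only a sketch: the function name printable_chars is an assumption for illustration, and chr(i) is used as an equivalent of '{:c}'.format(i).

```python
def printable_chars(limit):
    """Collect the printable characters whose code points are below limit."""
    return [chr(i) for i in range(limit) if chr(i).isprintable()]

ascii_only = printable_chars(128)  # space through ~
latin1 = printable_chars(256)      # also includes accented letters such as ß and ü

print(len(ascii_only), len(latin1))
print('ß' in ascii_only, 'ß' in latin1)
```

Control characters, the non-breaking space (160), and the soft hyphen (173) are not printable according to str.isprintable, which is why they are missing from the table.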
  8. #5
  9. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jul 2013
    Location
    America (But which one?)
    Posts
    43
    Rep Power
    1
    Ahh!! Okay, I get that now! I will hopefully be able to test this out later this evening!

    Thank you very much for your reply!
  10. #6
  11. Contributing User
    Devshed Demi-God (4500 - 4999 posts)

    Join Date
    Aug 2011
    Posts
    4,711
    Rep Power
    480

    Often I wonder about myself....


    For whatever reason instead of writing comments

    # 128 or 256 are reasonable choices

    I've instead taken to expressions, such as

    128 or 256

    in which I'd just change the "or" to "and" to get the other value: 128 or 256 evaluates to 128, while 128 and 256 evaluates to 256.
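
The trick relies on Python's boolean operators returning one of their operands rather than True or False. A short check, using only built-in behavior (the variable names are made up for illustration):

```python
# "or" returns the first truthy operand; "and" returns the last operand
# when every operand is truthy.
limit_or = 128 or 256
limit_and = 128 and 256

print(limit_or)   # 128
print(limit_and)  # 256

# So the loop in the script runs over 128 code points as written,
# and over 256 once "or" is changed to "and".
print(len(range(limit_or)), len(range(limit_and)))
```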
