It’s Time To Learn PYTHON! (record of python challenges)

I sometimes referred to a set of solutions covering all the passes: HackingNote Python Challenges Solutions

The First Pass

The gibberish displayed on the screen is a message encoded with the Caesar cipher, one of the simplest encryption techniques. According to Wikipedia:

It is a type of substitution cipher in which each letter in the plaintext is replaced by a letter some fixed number of positions down the alphabet. For example, with a left shift of 3, D would be replaced by A, E would become B, and so on.

So I first wrote code like this:

para = input()
for i in para:
    # shift lowercase letters forward by 2, wrapping around after "z"
    if "a" <= i <= "z":
        print(chr((ord(i) + 2 - ord("a")) % 26 + ord("a")), end="")
    else:
        print(i, end="")

That works, but it is quite cumbersome. The decoded text recommends $string.maketrans()$ for this kind of problem. In Python 3 that function is available as str.maketrans, so I rewrote the code:

intext = "abcdefghijklmnopqrstuvwxyz"
outext = "cdefghijklmnopqrstuvwxyzab"  # the same alphabet shifted by 2
trans = str.maketrans(intext, outext)

para = input()
print(para.translate(trans))

The code is simpler now. Following the decoded hint, we apply the same translation to the “map” in the URL, which gives “ocr”. Replacing “map” with “ocr” in the URL takes us to the second pass.
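As a quick check of that last step, here is a minimal sketch that applies the shift-by-2 table to the string "map". Building the shifted alphabet by slicing string.ascii_lowercase is my own shortcut, not something the challenge asks for.

```python
import string

# Shift-by-2 table built by slicing the alphabet instead of typing it twice.
letters = string.ascii_lowercase
table = str.maketrans(letters, letters[2:] + letters[:2])

next_page = "map".translate(table)
print(next_page)  # prints: ocr
```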

The Second Pass

recognize the characters. maybe they are in the book, but MAYBE they are in the page source.

This hints that the solution is hidden in the page source. Opening the page source reveals another hint, followed by a long block of assorted symbols (screenshot: psource.png).

The second hint tells us to find the characters. But how do we get the content of the page source from Python? The $\text{urllib}$ package provides the function urllib.request.urlopen, which loads the raw HTML. We then call read and decode to turn the bytes into a string, and use a regular expression to extract the comment block.

Once the comment block is a single string, the rest is straightforward: applying re.findall again, this time keeping only the letters, yields the answer $\text{equality}$.

The code is shown below:

from urllib.request import urlopen
import re

myURL = urlopen("http://www.pythonchallenge.com/pc/def/ocr.html")
url = myURL.read().decode()
# grab the comment block that starts with whitespace followed by "%"
readl = re.findall(r"<!--\s%[\s\S]*-->", url)
# keep only the letters hidden among the symbols
print("".join(re.findall("[a-zA-Z]", readl[0])))
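To try the two regular expressions without hitting the network, here is a small sketch on a made-up page. The sample_html string is hypothetical and much shorter than the real page source, and the word it hides is not the real answer.

```python
import re

# Hypothetical stand-in for the page source: one ordinary comment,
# then a comment whose body starts with "%" and hides a few letters.
sample_html = (
    "<html>\n"
    "<!-- find rare characters in the mess below: -->\n"
    "<!--\n"
    "%%$@_$^e#)^)&!q+]!*@\n"
    "&^}u[@%](a)%+$l[(_@%\n"
    "-->\n"
    "</html>"
)

# Only the second comment matches, because "<!--" must be followed by
# whitespace and then "%".
blocks = re.findall(r"<!--\s%[\s\S]*-->", sample_html)
hidden = "".join(re.findall("[a-zA-Z]", blocks[0]))
print(hidden)  # prints: equal
```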

The Third Pass

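Nearly the same as the second pass. This time the regular expression picks out each lowercase letter that has exactly three uppercase letters on either side.

from urllib.request import urlopen
import re

myURL = urlopen("http://www.pythonchallenge.com/pc/def/equality.html")
url = myURL.read().decode()
readl = re.findall(r"<!--[\s\S]*-->", url)
# [^A-Z] on both ends rules out runs of four or more uppercase letters
print("".join(re.findall("[^A-Z][A-Z]{3}([a-z])[A-Z]{3}[^A-Z]", readl[0])))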
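The effect of the [^A-Z] guards is easier to see on a tiny input. The sample string below is made up for illustration: the first and third groups have exactly three capitals on each side of the lowercase letter, while the middle group has four on the left and is rejected.

```python
import re

pattern = "[^A-Z][A-Z]{3}([a-z])[A-Z]{3}[^A-Z]"

# Hypothetical test string, not taken from the real page.
sample = "qXYZaBCDw ABCDeFGHz mKLMbNOPn"

guarded = "".join(re.findall(pattern, sample))
print(guarded)  # prints: ab
```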

The Fourth Pass

From this pass I learned that a placeholder such as %s can stand in for a value inside a string. To fill it in, put the % operator between the string and the value, as in web % add.

from urllib.request import urlopen
import re

findx = ['a']  # non-empty so that the loop runs at least once
web = "http://www.pythonchallenge.com/pc/def/linkedlist.php?nothing=%s"
# add = "12345"
add = "8022"
while findx:
    myURL = urlopen(web % add)
    url = myURL.read().decode()
    findx = re.findall("and the next nothing is (.*)", url)
    if findx:
        add = findx[0]
    print(add, url)
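Here is the %s substitution on its own, as a minimal sketch with no network access. The f-string line is an alternative spelling I added for comparison; the original solution only uses the % operator.

```python
web = "http://www.pythonchallenge.com/pc/def/linkedlist.php?nothing=%s"
add = "8022"

# One placeholder, one value.
step_url = web % add

# Several placeholders take a tuple of values, filled in order.
pair = "%s and %s" % ("12345", "8022")

# The same URL written as an f-string (added for comparison).
same_url = f"http://www.pythonchallenge.com/pc/def/linkedlist.php?nothing={add}"

print(step_url)
print(pair)
```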

The Fifth Pass

According to the page source, we replace peak.html with banner.p and land on a page full of symbols. Decoding them as UTF-8 does not produce anything readable. Meanwhile, the original page carries a hint: $\text{pronounce it}$. Said aloud, $\text{peak}$ sounds a bit like $\text{pickle}$, Python's object serialization module. This hints that the data should be loaded with pickle, namely:

myURL = urlopen("http://www.pythonchallenge.com/pc/def/banner.p")
readl = pickle.load(myURL)

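The unpickled object is a list of rows, each of which is a list of (character, count) pairs. Writing out each character repeated count times produces a banner (image: @2x.png).

The problem is now solved. The full code is shown below: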

from urllib.request import urlopen
import pickle

myURL = urlopen("http://www.pythonchallenge.com/pc/def/banner.p")
readl = pickle.load(myURL)
with open("log.txt", "w") as f:
    for lst in readl:        # one list per banner row
        for tup in lst:      # tup is (character, repeat count)
            f.write(tup[0] * tup[1])
        f.write("\n")
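The same decoding can be tried offline. This sketch pickles a tiny made-up banner, loads it back, and expands the (character, count) pairs; the rows data is hypothetical and only mimics the shape of the real file.

```python
import pickle

# Hypothetical banner data: rows of (character, repeat count) pairs.
rows = [[(" ", 2), ("#", 3)], [("#", 5)]]

blob = pickle.dumps(rows)       # bytes, like the contents of a .p file
restored = pickle.loads(blob)   # back to ordinary lists and tuples

# Expand each pair and join the rows with newlines.
banner = "\n".join("".join(ch * n for ch, n in row) for row in restored)
print(banner)
```

Only unpickle data from a source you trust, since loading a pickle can execute arbitrary code.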

The Sixth Pass

This is nearly the same as the previous passes, except that the information has moved from the page source into a zip file. We only need to replace urllib with the zipfile module:

import zipfile, re

f = zipfile.ZipFile("channel.zip")  # open the zip file
add = '90052'
findx = ['']
comment = ''
while findx:
    # the data is bytes until it is decoded
    readl = f.read(add + '.txt').decode()
    findx = re.findall("Next nothing is (.*)", readl)
    # each archive member carries one piece of the answer in its comment
    comment += f.getinfo(add + '.txt').comment.decode()
    if findx:
        add = findx[0]
print(comment)

The answer is oxygen
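To see how per-file comments work without downloading channel.zip, here is a self-contained sketch that builds a three-file archive in memory and then walks it the same way. The file names, contents, comments, and the follow_chain helper are all made up for illustration.

```python
import io
import re
import zipfile

# Build a small in-memory archive; every member gets a one-character comment.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    members = [
        ("1", "Next nothing is 2", b"h"),
        ("2", "Next nothing is 3", b"i"),
        ("3", "Collect the comments.", b"!"),
    ]
    for name, body, note in members:
        info = zipfile.ZipInfo(name + ".txt")
        info.comment = note          # stored in the archive's central directory
        zf.writestr(info, body)

def follow_chain(archive, start):
    """Follow the 'Next nothing is N' links and collect member comments."""
    comments = []
    current = start
    while current is not None:
        name = current + ".txt"
        text = archive.read(name).decode()
        comments.append(archive.getinfo(name).comment.decode())
        match = re.search(r"Next nothing is (\d+)", text)
        current = match.group(1) if match else None
    return "".join(comments)

with zipfile.ZipFile(buf) as archive:
    collected = follow_chain(archive, "1")
print(collected)  # prints: hi!
```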

The Seventh Pass

In this pass we only have an image and no instructions. But there is a grey bar across the middle of the image, which may be the key. We use urlopen to load the image as bytes, wrap them in BytesIO so that they behave like a file, and pass that to the open function of the Image module to get the image data.

img = Image.open(BytesIO(urlopen("http://www.pythonchallenge.com/pc/def/oxygen.png").read()))

Calling getpixel on a pixel returns a tuple (R, G, B, alpha). Each grey block in the bar is seven pixels wide, so we keep every seventh pixel of the middle row and read its R value (equal to G and B for grey) as an $ASCII$ code.

lst = [img.getpixel((x, img.height // 2)) for x in range(img.width)][::7]
lst = [R for R, G, B, alpha in lst]

Printing the result as characters gives another hint:

smart guy, you made it. the next level is $[105, 110, 116, 101, 103, 114, 105, 116, 121]$pe_

Following the hint, we apply the same step to this list of numbers, converting each one to its character, and get the answer:

$integrity$

from urllib.request import urlopen
from io import BytesIO
from PIL import Image

# img = Image.open(BytesIO(requests.get("http://www.pythonchallenge.com/pc/def/oxygen.png").content))
img = Image.open(BytesIO(urlopen("http://www.pythonchallenge.com/pc/def/oxygen.png").read()))
# sample the middle row, one pixel per 7-pixel-wide grey block
lst = [img.getpixel((x, img.height // 2)) for x in range(img.width)][::7]
lst = [R for R, G, B, alpha in lst]
print("".join(map(chr, lst)))
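The two decoding steps, sampling every seventh value and mapping numbers to characters, can be checked without the image. In this sketch the row list is a made-up stand-in for the red channel of the middle row; the codes list is the one printed by the hint.

```python
# Hypothetical stand-in for the red channel of the middle row:
# each character code repeated 7 times, like the 7-pixel-wide grey blocks.
row = [ord(ch) for ch in "hint" for _ in range(7)]

sampled = row[::7]                    # one value per block
message = "".join(map(chr, sampled))
print(message)  # prints: hint

# The list from the hint decodes the same way.
codes = [105, 110, 116, 101, 103, 114, 105, 116, 121]
answer = "".join(map(chr, codes))
print(answer)  # prints: integrity
```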

The Eighth Pass

The username and password in the page source are compressed with bzip2, so we decompress them with the bz2 module:

import bz2
un = bz2.decompress (b'BZh91AY&SYA\xaf\x82\r\x00\x00\x01\x01\x80\x02\xc0\x02\x00 \x00!\x9ah3M\x07<]\xc9\x14\xe1BA\x06\xbe\x084')
pw = bz2.decompress (b'BZh91AY&SY\x94$|\x0e\x00\x00\x00\x81\x00\x03$ \x00!\x9ah3M\x13<]\xc9\x14\xe1BBP\x91\xf08')
print (un, pw)

Or you can even find the answer in the page source…
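For a feel of what bz2 is doing, here is a round trip on a made-up value. The secret string is hypothetical; the point is only that compressed data begins with the "BZh" signature seen in the byte strings above, and that decompress returns bytes that still need decoding.

```python
import bz2

secret = "example-user"               # hypothetical value, not the real login

packed = bz2.compress(secret.encode())
print(packed[:3])                     # the bzip2 signature

unpacked = bz2.decompress(packed)     # bytes
restored = unpacked.decode()          # str
print(restored)
```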

The Ninth Pass

The numbers in the page source are the coordinates of a series of points, listed as alternating x and y values. We use the ImageDraw module in PIL to draw them.

first = [146,399,163,403,170,393,169,391,166,386,170,381,170,371,170,355,169,346,167,335,170,329,170,320,170,
310,171,301,173,290,178,289,182,287,188,286,190,286,192,291,194,296,195,305,194,307,191,312,190,316,
190,321,192,331,193,338,196,341,197,346,199,352,198,360,197,366,197,373,196,380,197,383,196,387,192,
389,191,392,190,396,189,400,194,401,201,402,208,403,213,402,216,401,219,397,219,393,216,390,215,385,
215,379,213,373,213,365,212,360,210,353,210,347,212,338,213,329,214,319,215,311,215,306,216,296,218,
290,221,283,225,282,233,284,238,287,243,290,250,291,255,294,261,293,265,291,271,291,273,289,278,287,
279,285,281,280,284,278,284,276,287,277,289,283,291,286,294,291,296,295,299,300,301,304,304,320,305,
327,306,332,307,341,306,349,303,354,301,364,301,371,297,375,292,384,291,386,302,393,324,391,333,387,
328,375,329,367,329,353,330,341,331,328,336,319,338,310,341,304,341,285,341,278,343,269,344,262,346,
259,346,251,349,259,349,264,349,273,349,280,349,288,349,295,349,298,354,293,356,286,354,279,352,268,
352,257,351,249,350,234,351,211,352,197,354,185,353,171,351,154,348,147,342,137,339,132,330,122,327,
120,314,116,304,117,293,118,284,118,281,122,275,128,265,129,257,131,244,133,239,134,228,136,221,137,
214,138,209,135,201,132,192,130,184,131,175,129,170,131,159,134,157,134,160,130,170,125,176,114,176,
102,173,103,172,108,171,111,163,115,156,116,149,117,142,116,136,115,129,115,124,115,120,115,115,117,
113,120,109,122,102,122,100,121,95,121,89,115,87,110,82,109,84,118,89,123,93,129,100,130,108,132,110,
133,110,136,107,138,105,140,95,138,86,141,79,149,77,155,81,162,90,165,97,167,99,171,109,171,107,161,
111,156,113,170,115,185,118,208,117,223,121,239,128,251,133,259,136,266,139,276,143,290,148,310,151,
332,155,348,156,353,153,366,149,379,147,394,146,399]
second = [156,141,165,135,169,131,176,130,187,134,191,140,191,146,186,150,179,155,175,157,168,157,163,157,159,
157,158,164,159,175,159,181,157,191,154,197,153,205,153,210,152,212,147,215,146,218,143,220,132,220,
125,217,119,209,116,196,115,185,114,172,114,167,112,161,109,165,107,170,99,171,97,167,89,164,81,162,
77,155,81,148,87,140,96,138,105,141,110,136,111,126,113,129,118,117,128,114,137,115,146,114,155,115,
158,121,157,128,156,134,157,136,156,136]

from PIL import Image, ImageDraw

im = Image.new('RGB', (500, 500))    # create a new blank image
draw = ImageDraw.Draw(im)            # create a drawing object for the image
draw.polygon(first, fill='white')    # draw a polygon filled in white
draw.polygon(second, fill='white')
im.show()
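The flat lists can be hard to read as coordinates. This short sketch pairs up the first six numbers of first into (x, y) tuples using slicing and zip; the pairing is only for inspection here.

```python
# The first six numbers of `first`, i.e. its first three points.
flat = [146, 399, 163, 403, 170, 393]

# Even positions are x values, odd positions are y values.
points = list(zip(flat[0::2], flat[1::2]))
print(points)  # prints: [(146, 399), (163, 403), (170, 393)]
```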

The Tenth Pass

The sequence you get by clicking the bull is called the $\text{look-and-say sequence}$. A quick search on $Google$ tells us that:

To generate a member of the sequence from the previous member, read off the digits of the previous member, counting the number of digits in groups of the same digit.

It is easy to use a regular expression in $\text{Python}$ to obtain the target string.

One thing puzzled me at first: the pattern has to be written as a raw string, or with the backslash doubled. In an ordinary string literal, Python itself turns "\1" into the single character with code 1 before the regex engine ever sees it, so the backreference is lost. "\d" only survives because it is not a recognised string escape.

import re

strl = "1"
for p in range(30):
    # (\d) captures a digit and (\1*) captures the rest of its run,
    # so len(i + j) is the run length and i is the digit
    strl = "".join([str(len(i + j)) + i for i, j in re.findall(r"(\d)(\1*)", strl)])
print(len(strl))
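The difference between the escaped and raw spellings is easy to check directly. This sketch compares the three ways of writing the backreference and then runs one look-and-say step; next_term is a helper name I made up for the illustration.

```python
import re

plain = "\1"      # string escape: one character with code 1
doubled = "\\1"   # two characters: a backslash and the digit 1
raw = r"\1"       # the same two characters, written as a raw string

print(len(plain), len(doubled), len(raw))  # prints: 1 2 2

def next_term(term):
    """Return the look-and-say successor of a string of digits."""
    runs = re.findall(r"(\d)(\1*)", term)
    return "".join(str(len(digit + rest)) + digit for digit, rest in runs)

print(next_term("1211"))  # prints: 111221
```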