Talking About How I Read Source Code

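The project for this code reading is web-server, a subproject of 500lines. 500 Lines or Less is both a project and a book of the same name, with source code as well as explanatory text. It is made up of independent chapters; in each one, an expert in the field uses 500 lines of code or fewer to show readers a simple implementation of some feature or requirement. This article covers the following parts: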

Introduction
Project structure
Simple HTTP service
Echo service
File service
File directory service and CGI service
Service refactoring
Summary
Tips

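Introduction

We have already worked our way through the source code of twelve projects, so it is time to step back and talk about how to read source code.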

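There are many Python projects, and plenty of them are excellent. Studying their source code gives us a deeper understanding of their APIs and of the principles and details behind their implementation. Merely knowing how to call a project's API is not enough for those of us who want to advance. Personally, I think reading source code teaches more than reading books, doing exercises, or repeatedly reinventing the wheel. Learning is a process of moving from imitation to creation: read excellent source code, imitate it, then surpass it.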

Choosing a suitable project takes some skill as well. Here is my approach:

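Keep the project small. When you are starting out and your skills are limited, a project with less code is easier to read through. In the early stage, try to stay under 5,000 lines.
Pick projects that run vertically through one direction, and gradually connect the whole chain. For example, around the different stages of an HTTP service we read gunicorn, wsgi, http-server, bottle, and mako: from the server to the WSGI specification, from the web framework to the template engine.
Pick projects that can be compared horizontally. For example, for CLIs, compare getopt with argparse; or compare blinker with flask/django-signal.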

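Once a project is chosen, the question is how to read its source. I call the method we have used so far skim reading (概读法). Concretely, it means analyzing only the core implementation of the project's main functionality, and setting auxiliary and enhancement features aside for now so as not to get bogged down in details. A simple example: "研表究明，汉字的序顺并不定一影阅响读" (roughly: "Research shows that the order of Chinese characters does not necessarily affect reading; only after finishing this sentence do you notice that its characters are all scrambled"). In the same way, once we understand a project's main functionality, we have reached our initial goal.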

Haha, happy April Fools' Day.

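Skim reading has one drawback: we learn that the code is implemented a certain way, but we cannot explain why. So it is time to introduce another way of reading code: the historical comparison method. It compares how the requirements and the code change from version to version, and from that learns how each requirement was implemented. In most projects, the commit messages in git log show this history and the requirements behind it. The 500lines web-server project used in this article ships its evolution stages directly, which makes it ideal for demonstrating the method.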

Project structure

This reading uses revision fba689d1. The project directory structure is shown in the table below:

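Directory	Description
00-hello-web	Simple HTTP service
01-echo-request-info	HTTP service that displays the request
02-serve-static	Static file service
03-handlers	HTTP file service with directory listing support
04-cgi	CGI implementation
05-refactored	Refactored HTTP service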

Simple HTTP service

The HTTP service is very simple. It is started like this:

serverAddress = ('', 8080)
server = BaseHTTPServer.HTTPServer(serverAddress, RequestHandler)
server.serve_forever()

A handler that responds only to GET requests:

class RequestHandler(BaseHTTPServer.BaseHTTPRequestHandler):
    ...
    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-type", "text/html")
        self.send_header("Content-Length", str(len(self.Page)))
        self.end_headers()
        self.wfile.write(self.Page)
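
The excerpts above are Python 2, where the module is called BaseHTTPServer; in Python 3 it was renamed http.server. As a rough sketch only, here is how the same hello service might look in Python 3. The Page contents are reconstructed from the curl output shown in this section, and fetch_once is a hypothetical helper added purely so the sketch can be exercised without a separate terminal; neither is taken from the project source.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class RequestHandler(BaseHTTPRequestHandler):
    """Answer every GET request with the same fixed page."""

    # Bytes rather than str, because wfile.write() needs bytes in Python 3.
    Page = b"<html>\n<body>\n<p>Hello, web!</p>\n</body>\n</html>\n"

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-type", "text/html")
        self.send_header("Content-Length", str(len(self.Page)))
        self.end_headers()
        self.wfile.write(self.Page)

    def log_message(self, format, *args):
        # Keep the sketch quiet; the default logs each request to stderr.
        pass


def fetch_once():
    """Serve on a free local port, fetch "/", then shut down (hypothetical helper)."""
    server = HTTPServer(("127.0.0.1", 0), RequestHandler)  # port 0: the OS picks one
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1], timeout=5)
        conn.request("GET", "/")
        response = conn.getresponse()
        result = (response.status, response.read())
        conn.close()
        return result
    finally:
        server.shutdown()
        server.server_close()
```

To run it as a long-lived service instead, build the server with HTTPServer(('', 8080), RequestHandler) and call serve_forever() on it, as in the original start-up code.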

The following sample request shows the service in action:

# curl -v http://127.0.0.1:8080
*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to 127.0.0.1 (127.0.0.1) port 8080 (#0)
> GET / HTTP/1.1
> Host: 127.0.0.1:8080
> User-Agent: curl/7.64.1
> Accept: */*
>
* HTTP 1.0, assume close after body
< HTTP/1.0 200 OK
< Server: BaseHTTP/0.3 Python/2.7.16
< Date: Wed, 31 Mar 2021 11:57:03 GMT
< Content-type: text/html
< Content-Length: 49
<
<html>
<body>
<p>Hello, web!</p>
</body>
</html>
* Closing connection 0

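This article does not go into how the HTTP protocol details are implemented. If you want those details, see the second post, or my earlier [python http source code reading].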

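Echo service

The echo service evolves from the simple HTTP service and echoes the user's request back. Comparing the two files shows what changed:

The change is concentrated in the implementation of do_GET. The screenshot may not be very clear, so the code is pasted below: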

# hello
def do_GET(self):
    self.send_response(200)
    ...
    self.wfile.write(self.Page)

# echo
def do_GET(self):
    page = self.create_page()
    self.send_page(page)

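As you can see, echo's do_GET calls two methods, create_page and send_page. Those two short lines show the difference between echo and hello very clearly. Because echo has to read the client's request and output it as-is, a fixed page cannot meet the requirement: the page must first be created from a template and then sent to the user. The implementation of hello's do_GET was refactored into the body of send_page, and the newly added create_page is very simple: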

def create_page(self):
    values = {
        'date_time'   : self.date_time_string(),
        'client_host' : self.client_address[0],
        'client_port' : self.client_address[1],
        'command'     : self.command,
        'path'        : self.path
    }
    page = self.Page.format(**values)
    return page
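
Because create_page only reads a few attributes and fills a template, it can be tried out without opening a socket. The sketch below is an illustration under assumptions: the PAGE template layout is made up (the article does not show the real Page), and fake_request is a stand-in object that carries only the attributes create_page reads.

```python
from types import SimpleNamespace

# Assumed template layout; the placeholder names match the keys built in create_page.
PAGE = """\
<html>
<body>
<table>
<tr><td>Date and time</td><td>{date_time}</td></tr>
<tr><td>Client host</td><td>{client_host}</td></tr>
<tr><td>Client port</td><td>{client_port}</td></tr>
<tr><td>Command</td><td>{command}</td></tr>
<tr><td>Path</td><td>{path}</td></tr>
</table>
</body>
</html>
"""


def create_page(request, template=PAGE):
    """Fill the template with details of the request, as the echo handler does."""
    values = {
        'date_time': request.date_time_string(),
        'client_host': request.client_address[0],
        'client_port': request.client_address[1],
        'command': request.command,
        'path': request.path,
    }
    # str.format(**values) substitutes each {name} placeholder by keyword.
    return template.format(**values)


# Stand-in for a real handler instance (hypothetical values).
fake_request = SimpleNamespace(
    date_time_string=lambda: "Wed, 31 Mar 2021 11:57:03 GMT",
    client_address=("127.0.0.1", 54321),
    command="GET",
    path="/index.html",
)

page = create_page(fake_request)
```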

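Read on its own, the echo code looks plain. Only by comparing hello and echo can you appreciate the author's craftsmanship. The code shows how to write readable code and how to implement a new requirement: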

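The function names create_page and send_page are clear and readable; their purpose is evident from the name.
create and send are naturally parallel. As a counterexample, if the functions were named create_page and _do_GET with the functionality unchanged, everyone would find it awkward.
The five lines that implement do_GET in hello did not change at all; they were only moved into the new send_page function. From a testing standpoint, only the changed part (create_page) needs new test cases.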

The command used for the comparison is vimdiff 00-hello-web/server.py 01-echo-request-info/server.py; the diff tool in an IDE works too.
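
If neither vimdiff nor an IDE is at hand, the same kind of comparison can be approximated with Python's standard difflib module. This is only a sketch: HELLO and ECHO are shortened stand-ins for the two do_GET versions quoted above, not the full files, and diff_versions is a hypothetical helper name.

```python
import difflib

# Shortened stand-ins for the two versions of do_GET.
HELLO = """\
def do_GET(self):
    self.send_response(200)
    self.wfile.write(self.Page)
"""

ECHO = """\
def do_GET(self):
    page = self.create_page()
    self.send_page(page)
"""


def diff_versions(old, new,
                  old_name="00-hello-web/server.py",
                  new_name="01-echo-request-info/server.py"):
    """Return a unified diff between two versions of a source text as a list of lines."""
    return list(difflib.unified_diff(
        old.splitlines(), new.splitlines(),
        fromfile=old_name, tofile=new_name,
        lineterm="",  # inputs have no trailing newlines, so headers should not either
    ))


diff = diff_versions(HELLO, ECHO)
```

Printing "\n".join(diff) gives the familiar output in which removed lines start with - and added lines start with +. To compare the real files, read each one with open(path).read() and pass the two strings in.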

File service

The file service can serve local html pages:

# Classify and handle request.
def do_GET(self):
    try:
        # Figure out what exactly is being requested.
        full_path = os.getcwd() + self.path
        # The file does not exist
        if not os.path.exists(full_path):
            raise ServerException("'{0}' not found".format(self.path))
        # Handle an html file
        elif os.path.isfile(full_path):
            self.handle_file(full_path)
        ...
    # Handle exceptions
    except Exception as msg:
        self.handle_error(msg)

Handling files and errors:

def handle_file(self, full_path):
    try:
        with open(full_path, 'rb') as reader:
            content = reader.read()
        self.send_content(content)
    except IOError as msg:
        msg = "'{0}' cannot be read: {1}".format(self.path, msg)
        self.handle_error(msg)

def handle_error(self, msg):
    content = self.Error_Page.format(path=self.path, msg=msg)
    self.send_content(content)

The directory also provides a status-code version. Comparing again:

If the file does not exist, the HTTP specification says a 404 error should be returned:

def handle_error(self, msg):
    content = ...
    self.send_content(content, 404)

def send_content(self, content, status=200):
    self.send_response(status)
    ...

This uses Python's support for default parameter values to keep send_content stable: even if 30x/50x errors are added later, send_content itself does not need to change.
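
To see why the default value keeps existing callers untouched, here is a small sketch with a stand-in class that records what would be sent instead of writing to a socket. RecordingHandler and handle_server_error are hypothetical names used only for this illustration.

```python
class RecordingHandler:
    """Stand-in for the request handler; records (status, content) pairs."""

    def __init__(self):
        self.sent = []

    def send_content(self, content, status=200):
        # The real method writes the status line, headers and body;
        # the sketch only records what would have been sent.
        self.sent.append((status, content))

    def handle_file(self, content):
        # Existing caller: unchanged, silently uses the default 200.
        self.send_content(content)

    def handle_error(self, msg):
        self.send_content("error: {0}".format(msg), 404)

    def handle_server_error(self, msg):
        # Hypothetical later addition: a new status code needs no change to send_content.
        self.send_content("error: {0}".format(msg), 500)


handler = RecordingHandler()
handler.handle_file("hello")
handler.handle_error("missing")
handler.handle_server_error("boom")
```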

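File directory service and CGI service

The file service needs to be upgraded to support directories. Typically, if a directory contains index.html, that file is served; if not, a directory listing is shown so that users can browse it without typing file names by hand.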

Again, I put the comparison between versions into a figure, mainly showing the changes to RequestHandler:

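do_GET has to handle three kinds of logic: html files, directories, and errors. Continuing with if-else would make the code ugly and hard to extend, so the strategy pattern is used here: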

# Ordered strategies
Cases = [case_no_file(),
         case_existing_file(),
         case_always_fail()]

# Classify and handle request.
def do_GET(self):
    try:
        # Figure out what exactly is being requested.
        self.full_path = os.getcwd() + self.path
        # Pick a strategy
        for case in self.Cases:
            if case.test(self):
                case.act(self)
                break
    # Handle errors.
    except Exception as msg:
        self.handle_error(msg)

The three strategies, for an html file, a missing file, and the fallback error:

class case_no_file(object):
    '''File or directory does not exist.'''
    def test(self, handler):
        return not os.path.exists(handler.full_path)
    def act(self, handler):
        raise ServerException("'{0}' not found".format(handler.path))

class case_existing_file(object):
    '''File exists.'''
    def test(self, handler):
        return os.path.isfile(handler.full_path)
    def act(self, handler):
        handler.handle_file(handler.full_path)

class case_always_fail(object):
    '''Base case if nothing else worked.'''
    def test(self, handler):
        return True
    def act(self, handler):
        raise ServerException("Unknown object '{0}'".format(handler.path))
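
The "first matching case wins" loop can be tried in isolation. The following is a simplified sketch, not the project's code: the cases take a path directly instead of a handler, act returns a label instead of sending a response, and the class names, dispatch, and demo are all hypothetical.

```python
import os
import tempfile


class CaseNoFile:
    def test(self, full_path):
        return not os.path.exists(full_path)

    def act(self, full_path):
        return "not found"


class CaseExistingFile:
    def test(self, full_path):
        return os.path.isfile(full_path)

    def act(self, full_path):
        return "serve file"


class CaseAlwaysFail:
    def test(self, full_path):
        return True  # catch-all, so it must stay last in the list

    def act(self, full_path):
        return "unknown object"


# Order matters: the first case whose test() passes handles the request.
CASES = [CaseNoFile(), CaseExistingFile(), CaseAlwaysFail()]


def dispatch(full_path, cases=CASES):
    for case in cases:
        if case.test(full_path):
            return case.act(full_path)
    raise RuntimeError("no case matched")  # unreachable while a catch-all is present


def demo():
    """Run the dispatcher against a missing path, a file, and a directory."""
    with tempfile.TemporaryDirectory() as root:
        file_path = os.path.join(root, "index.html")
        with open(file_path, "w") as writer:
            writer.write("<p>hi</p>")
        missing = os.path.join(root, "missing.html")
        return [dispatch(missing), dispatch(file_path), dispatch(root)]
```

In this sketch a directory falls through to the catch-all, which is the gap that new directory cases would fill by being placed ahead of it in the list.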

Directory support is then simple: add the case_directory_index_file and case_directory_no_index_file strategies. CGI support works the same way, by adding a case_cgi_file strategy.

class case_directory_index_file(object):
    ...

class case_directory_no_index_file(object):
    ...

class case_cgi_file(object):
    ...

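Service refactoring

After completing the features, the author refactored the code:

After the refactoring, RequestHandler is much more concise and only deals with HTTP protocol details. handle_error handles exceptions and returns a 404 error; send_content generates the HTTP response.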

class RequestHandler(BaseHTTPServer.BaseHTTPRequestHandler):

    # Classify and handle request.
    def do_GET(self):
        try:
            # Figure out what exactly is being requested.
            self.full_path = os.getcwd() + self.path
            # Figure out how to handle it.
            for case in self.Cases:
                if case.test(self):
                    case.act(self)
                    break
        # Handle errors.
        except Exception as msg:
            self.handle_error(msg)

    # Handle unknown objects.
    def handle_error(self, msg):
        content = self.Error_Page.format(path=self.path, msg=msg)
        self.send_content(content, 404)

    # Send actual content.
    def send_content(self, content, status=200):
        self.send_response(status)
        self.send_header("Content-type", "text/html")
        self.send_header("Content-Length", str(len(content)))
        self.end_headers()
        self.wfile.write(content)

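The request-handling strategies were refactored too. A base_case parent class was introduced; it fixes the template and steps for handling a request, and provides a default method for reading html files.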

class base_case(object):
    '''Parent for case handlers.'''

    def handle_file(self, handler, full_path):
        try:
            with open(full_path, 'rb') as reader:
                content = reader.read()
            handler.send_content(content)
        except IOError as msg:
            msg = "'{0}' cannot be read: {1}".format(full_path, msg)
            handler.handle_error(msg)

    def index_path(self, handler):
        return os.path.join(handler.full_path, 'index.html')

    def test(self, handler):
        assert False, 'Not implemented.'

    def act(self, handler):
        assert False, 'Not implemented.'
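
One thing to keep in mind about the assert False placeholders: assert statements are skipped when Python runs with the -O option, so in that mode an unimplemented test or act would silently return None. A common alternative, shown here as a sketch rather than as what the project does, is to raise NotImplementedError. BaseCase, CaseIncomplete, and describe_failure are hypothetical names.

```python
class BaseCase:
    """Parent for case handlers (sketch of an alternative to assert False)."""

    def test(self, handler):
        raise NotImplementedError("test() must be implemented by a subclass")

    def act(self, handler):
        raise NotImplementedError("act() must be implemented by a subclass")


class CaseIncomplete(BaseCase):
    """Hypothetical subclass that forgets to override act()."""

    def test(self, handler):
        return True


def describe_failure(case):
    """Return the message raised when an incomplete case is used."""
    try:
        case.act(handler=None)
    except NotImplementedError as exc:
        return str(exc)
    return "no error"
```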

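The handler for html files is then very simple: it implements the test function and the act function, and the act function reuses the parent class's file-handling function.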

class case_existing_file(base_case):
    '''File exists.'''

    def test(self, handler):
        return os.path.isfile(handler.full_path)

    def act(self, handler):
        self.handle_file(handler, handler.full_path)

The longest strategy is the one for a directory without an index.html page:

class case_directory_no_index_file(base_case):
    '''Serve listing for a directory without an index.html page.'''

    # How to display a directory listing.
    Listing_Page = '''\
        <html>
        <body>
        <ul>
        {0}
        </ul>
        </body>
        </html>
        '''

    def list_dir(self, handler, full_path):
        try:
            entries = os.listdir(full_path)
            bullets = ['<li>{0}</li>'.format(e) for e in entries if not e.startswith('.')]
            page = self.Listing_Page.format('\n'.join(bullets))
            handler.send_content(page)
        except OSError as msg:
            # Use handler.path here: the case object itself has no path attribute.
            msg = "'{0}' cannot be listed: {1}".format(handler.path, msg)
            handler.handle_error(msg)

    def test(self, handler):
        return os.path.isdir(handler.full_path) and \
               not os.path.isfile(self.index_path(handler))

    def act(self, handler):
        self.list_dir(handler, handler.full_path)

list_dir dynamically generates an html page listing the directory's contents.
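
The page-building part of list_dir can be pulled out into a plain function and tried on a temporary directory. This is a sketch with two additions that are not in the original: the entries are sorted so the output is stable, and each name is passed through html.escape so that characters such as & do not break the markup. render_listing and demo are hypothetical names.

```python
import html
import os
import tempfile

LISTING_PAGE = """\
<html>
<body>
<ul>
{0}
</ul>
</body>
</html>
"""


def render_listing(full_path):
    """Build an html listing of a directory, skipping hidden entries."""
    entries = sorted(os.listdir(full_path))  # sorting is an addition for stable output
    bullets = ['<li>{0}</li>'.format(html.escape(e))  # escaping is an addition too
               for e in entries if not e.startswith('.')]
    return LISTING_PAGE.format('\n'.join(bullets))


def demo():
    """Render a listing for a temporary directory holding three files."""
    with tempfile.TemporaryDirectory() as root:
        for name in ("a.txt", "b&c.txt", ".hidden"):
            with open(os.path.join(root, name), "w") as writer:
                writer.write("x")
        return render_listing(root)
```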

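Summary

Using the historical comparison method, we read through the evolution of the 500lines web-server code and saw clearly how a file directory service is built step by step: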

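RequestHandler's do_GET method handles the HTTP request
send_content writes the response, including the status code, headers, and body
html files are read and served as html pages
Directories are listed
CGI is supported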

Along the way we also picked up examples of how to extend code, write maintainable code, and refactor. I hope you gained as much from it as I did.

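Tips

As described above, request handling uses the strategy pattern. Take a look first at the strategy pattern implementation from the python-patterns project: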

class Order:
    def __init__(self, price, discount_strategy=None):
        self.price = price
        self.discount_strategy = discount_strategy

    def price_after_discount(self):
        if self.discount_strategy:
            discount = self.discount_strategy(self)
        else:
            discount = 0
        return self.price - discount

    def __repr__(self):
        fmt = "<Price: {}, price after discount: {}>"
        return fmt.format(self.price, self.price_after_discount())


def ten_percent_discount(order):
    return order.price * 0.10


def on_sale_discount(order):
    return order.price * 0.25 + 20


def main():
    """
    >>> Order(100)
    <Price: 100, price after discount: 100>
    >>> Order(100, discount_strategy=ten_percent_discount)
    <Price: 100, price after discount: 90.0>
    >>> Order(1000, discount_strategy=on_sale_discount)
    <Price: 1000, price after discount: 730.0>
    """

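ten_percent_discount gives 10% off, and on_sale_discount gives 25% off plus a further 20 off. Different orders can use different discount strategies; for example, the sample can be adjusted as follows: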

order_amount_list = [80, 100, 1000]
for amount in order_amount_list:
    # Pick exactly one strategy per order, then move on to the next amount.
    if amount < 100:
        order = Order(amount)
    elif amount < 1000:
        order = Order(amount, discount_strategy=ten_percent_discount)
    else:
        order = Order(amount, discount_strategy=on_sale_discount)
    print(order)

The corresponding business logic is:

Orders under 100 get no discount
Orders under 1000 get 10% off
Orders of 1000 or more get 25% off plus a further 20 off

If we put the discount condition and the discount calculation into one class, it looks just like web-server:

class case_discount(object):
    def test(self, handler):
        # Discount condition
        ...
    def act(self, handler):
        # Compute the discount
        ...
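
Filling in that skeleton for the three business rules might look like the sketch below. It is an illustration only: the class names, DISCOUNT_CASES, and price_after_discount are hypothetical, and the cases take the order price directly rather than a handler object.

```python
class CaseNoDiscount:
    def test(self, price):
        return price < 100

    def act(self, price):
        return 0


class CaseTenPercent:
    def test(self, price):
        return price < 1000

    def act(self, price):
        return price * 0.10


class CaseOnSale:
    def test(self, price):
        return True  # everything that reaches this point is 1000 or more

    def act(self, price):
        return price * 0.25 + 20


# Ordered from the most specific condition to the catch-all, as in web-server's Cases.
DISCOUNT_CASES = [CaseNoDiscount(), CaseTenPercent(), CaseOnSale()]


def price_after_discount(price, cases=DISCOUNT_CASES):
    """Apply the first case whose test() accepts the price."""
    for case in cases:
        if case.test(price):
            return price - case.act(price)
    return price  # only reached if the list has no catch-all case


results = [price_after_discount(amount) for amount in (80, 100, 1000)]
```

Adding a new pricing rule then means writing one more class and placing it at the right position in the list, without touching price_after_discount.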

