mkdir sampleproject.git
cd sampleproject.git
git init --bare


unset GIT_DIR
DeployPath="/home/taidii/document/document/"
echo "==============================================="
cd $DeployPath
echo "deploying the document web"
git pull origin master
echo "================================================"

Chapter 2: Strings and Text

Preface

### Syntax of format

replacement_field ::=  "{" [field_name] ["!" conversion] [":" format_spec] "}"
field_name ::= arg_name ("." attribute_name | "[" element_index "]")*
arg_name ::= [identifier | integer]
attribute_name ::= identifier
element_index ::= integer | index_string
index_string ::= <any source character except "]"> +
conversion ::= "r" | "s" | "a"
format_spec ::= <described in the next section>

format_spec ::= [[fill]align][sign][#][0][width][grouping_option][.precision][type]
fill ::= <any character>
align ::= "<" | ">" | "=" | "^"
sign ::= "+" | "-" | " "
width ::= integer
grouping_option ::= "_" | "," # thousands separator
precision ::= integer
type ::= "b" | "c" | "d" | "e" | "E" | "f" | "F" | "g" | "G" | "n" | "o" | "s" | "x" | "X" | "%"
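
The grammar above is easier to read next to a few concrete calls. The following is a minimal sketch; the dictionary keys and sample values are chosen only for illustration and are not part of the original notes. Each entry exercises one part of `format_spec`.

```python
# Each entry exercises one piece of:
# [[fill]align][sign][#][0][width][grouping_option][.precision][type]
examples = {
    "fill_align": format("hi", "*^10"),    # fill '*', centered in width 10
    "sign": format(42, "+d"),              # always show the sign
    "alt_hex": format(255, "#x"),          # '#' adds the 0x prefix
    "zero_pad": format(42, "08.3f"),       # zero-padded to width 8, 3 decimals
    "grouping": format(1234567, ","),      # thousands separator
    "percent": format(0.256, ".1%"),       # percentage with 1 decimal
    "conversion": "{!r:>8}".format("a"),   # !r conversion, then right-align
}

for name, text in examples.items():
    print(f"{name:>10}: {text}")
```

The last entry shows that the conversion (`!r`) is applied first and the format spec (`>8`) is then applied to the converted string.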

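2.15 Interpolating Variables in Strings

Problem

You want to create a string with embedded variable names, where each variable is replaced by the string representation of its value.

Solution

s = '{name} has {n} apples'
# using keyword arguments
s.format(name='link', n=10)
#'link has 10 apples'

# using a mapping
s.format_map({'name': 'link', 'n': 10})
#'link has 10 apples'

# vars()
name = 'link'
n = 10
s.format_map(vars())
#'link has 10 apples'

Another interesting feature of vars() is that it also works with object instances.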

class Info:
    def __init__(self, name=None, n=None):
        self.name = name
        self.n = n

info = Info(name='link', n=10)
s.format_map(vars(info))
#'link has 10 apples'

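However, this still does not handle missing values well, so we can define a new dict class with a __missing__ method:

class SafeSub(dict):
    # Return the placeholder itself when a key is missing
    def __missing__(self, key):
        return '{' + key + '}'

del n  # make sure n is undefined
s.format_map(SafeSub(vars()))
#'link has {n} apples'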

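import sys

def sub(text):
    return text.format_map(SafeSub(sys._getframe(1).f_locals))
# sub() uses sys._getframe(1) to return the caller's stack frame, whose f_locals
# attribute gives access to the caller's local variables. Manipulating stack frames
# directly is discouraged in most code, but it is very useful for a string
# substitution utility like this one. Note also that f_locals is a dictionary that
# is a copy of the calling function's local variables. Although you can modify the
# contents of f_locals, the change has no effect on later variable access. So
# although accessing a stack frame looks evil, nothing you do to it overwrites or
# changes the caller's local variables.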

Discussion

Python offers several other approaches to string substitution. For example:

name = 'Guido'
n = 37
'%(name)s %(n)d messages.' % vars()

import string
s = string.Template('$name has $n messages.')
s.substitute(vars())

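However, format() and format_map() are more modern than these alternatives and should be preferred. They also support alignment, padding, and numeric formatting.

2.16 Reformatting Text to a Fixed Number of Columns

Problem

You have long strings that you want to reformat to a given column width.

Solution

Use the textwrap module to reformat text for output. For example: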

s = "Look into my eyes, look into my eyes, the eyes, the eyes, \
the eyes, not around the eyes, don't look around the eyes, \
look into my eyes, you're under."

import textwrap
textwrap.fill(s, width=20)
# Look into my eyes,
# look into my eyes,
# the eyes, the eyes,
# the eyes, not around
# the eyes, don't look
# around the eyes,
# look into my eyes,
# you're under.

textwrap.fill(s, 40, initial_indent='****')
# ****Look into my eyes, look into my
# eyes, the eyes, the eyes, the eyes, not
# around the eyes, don't look around the
# eyes, look into my eyes, you're under.

textwrap.fill(s, 40, subsequent_indent='****')
# Look into my eyes, look into my eyes,
# ****the eyes, the eyes, the eyes, not
# ****around the eyes, don't look around
# ****the eyes, look into my eyes, you're
# ****under.

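Discussion

textwrap is very useful for printing text, especially when you want the output to fit the terminal size. You can get the terminal size with os.get_terminal_size():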

import os
os.get_terminal_size()
# os.terminal_size(columns=76, lines=32)
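
Putting the two together, a minimal sketch of wrapping text to the current terminal width might look like this. The helper name `wrap_to_terminal` is hypothetical. It uses `shutil.get_terminal_size()` instead of `os.get_terminal_size()` because the `shutil` version accepts a fallback size, whereas the `os` version raises `OSError` when no terminal is attached (for example, when output is piped).

```python
import shutil
import textwrap


def wrap_to_terminal(text, fallback_columns=80):
    """Wrap text to the terminal width (hypothetical helper)."""
    # The fallback is used when the size cannot be determined.
    columns = shutil.get_terminal_size(fallback=(fallback_columns, 24)).columns
    return textwrap.fill(text, width=columns)


s = ("Look into my eyes, look into my eyes, the eyes, the eyes, "
     "the eyes, not around the eyes, don't look around the eyes, "
     "look into my eyes, you're under.")
print(wrap_to_terminal(s))
```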

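2.17 Handling HTML and XML Entities in Text

Problem

You want to replace HTML or XML entities such as &entity; or &#code; with their corresponding text. Alternatively, you need to escape certain characters in text (such as <, >, or &).

Solution

If you want to replace characters such as '<' or '>' in a text string, the html.escape() function makes it easy. For example: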

s = 'Elements are written as "<tag>text</tag>".'
from html import escape, unescape
es = escape(s)
# 'Elements are written as &quot;&lt;tag&gt;text&lt;/tag&gt;&quot;.'

escape(s, quote=False) # do not convert " and '
# 'Elements are written as "&lt;tag&gt;text&lt;/tag&gt;".'

unescape(es)
# 'Elements are written as "<tag>text</tag>".'

Discussion
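
The problem statement also mentions named and numeric entities such as &entity; and &#code;. As a small sketch (the sample strings are made up for illustration), `html.unescape()` handles both kinds, and the `xmlcharrefreplace` error handler goes the other way by turning non-ASCII characters into numeric references when encoding to ASCII.

```python
from html import escape, unescape

# Named (&quot;, &amp;, &lt;, &gt;) and numeric (&#241;) entities in one string
t = 'Spicy &quot;Jalape&#241;o&quot; &amp; &lt;mild&gt; salsa'
plain = unescape(t)
print(plain)

# Reverse direction: escape markup characters, then emit non-ASCII
# characters as numeric character references.
ascii_safe = escape('Jalapeño <hot>').encode('ascii', errors='xmlcharrefreplace').decode('ascii')
print(ascii_safe)
```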

2.18 Tokenizing Text

2.19 Writing a Simple Recursive Descent Parser

2.20 Performing Text Operations on Byte Strings

Upload a Python Package to PyPI or TestPyPI

~/.pypirc

[distutils]
index-servers=
pypi
testpypi

[testpypi]
repository: https://test.pypi.org/legacy/
username: your testpypi username
password: your testpypi password

twine upload --repository-url https://test.pypi.org/legacy/ dist/*
pip install --index-url https://test.pypi.org/simple/ your-package
pip install --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple your-package

twine upload --repository testpypi dist/*

#python setup.py register -r pypitest
python setup.py sdist upload -r pypitest
#python setup.py register -r pypi
python setup.py sdist upload -r pypi

Automatic Script Creation

https://setuptools.readthedocs.io/en/latest/setuptools.html#automatic-script-creation

> Automatic Script Creation
>
> Packaging and installing scripts can be a bit awkward with the distutils. For one thing, there’s no easy way to have a script’s filename match local conventions on both Windows and POSIX platforms. For another, you often have to create a separate file just for the “main” script, when your actual “main” is a function in a module somewhere. And even in Python 2.4, using the -m option only works for actual .py files that aren’t installed in a package.
>
> setuptools fixes all of these problems by automatically generating scripts for you with the correct extension, and on Windows it will even create an .exe file so that users don’t have to change their PATHEXT settings. The way to use this feature is to define “entry points” in your setup script that indicate what function the generated script should import and run. For example, to create two console scripts called foo and bar, and a GUI script called baz, you might do something like this:

setup(
    # other arguments here...
    entry_points={
        'console_scripts': [
            'foo = my_package.some_module:main_func',
            'bar = other_module:some_func',
        ],
        'gui_scripts': [
            'baz = my_package_gui:start_func',
        ]
    }
)

Introduction

Convolutional networks are at the core of most state-of-the-art computer vision solutions for a wide variety of tasks. Since 2014 very deep convolutional networks started to become mainstream, yielding substantial gains in various benchmarks. Although increased model size and computational cost tend to translate to immediate quality gains for most tasks (as long as enough labeled data is provided for training), computational efficiency and low parameter count are still enabling factors for various use cases such as mobile vision and big-data scenarios. Here we are exploring ways to scale up networks in ways that aim at utilizing the added computation as efficiently as possible by suitably factorized convolutions and aggressive regularization. We benchmark our methods on the ILSVRC 2012 classification challenge validation set and demonstrate substantial gains over the state of the art: 21.2% top-1 and 5.6% top-5 error for single frame evaluation using a network with a computational cost of 5 billion multiply-adds per inference and with using less than 25 million parameters. With an ensemble of 4 models and multi-crop evaluation, we report 3.5% top-5 error and 17.3% top-1 error.

Summary: convolutional neural networks are at the core of computer vision solutions for a wide variety of tasks. Since 2014, deep convolutional networks have become mainstream and have produced substantial gains. Although larger models and higher computational cost tend to translate quickly into quality gains for most tasks, computational efficiency and a low parameter count still matter in many scenarios, such as mobile and big data. Here we explore ways to scale up networks while using the added computation efficiently, through factorized convolutions and aggressive regularization.

Factorization convolution

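Example 1: as shown in the figure, replacing one 5x5 convolution with two stacked 3x3 convolutions reduces the parameter count to (3*3 + 3*3)/(5*5) = 72% of the original, at the cost of increased depth.

# References: * Rethinking the Inception Architecture for Computer Vision https://arxiv.org/pdf/1512.00567.pdf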
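
The arithmetic can be checked with a short sketch. The function names are hypothetical, and the count assumes a single input and output channel with no bias terms, which is what the ratio in the text implies.

```python
def conv_params(kernel_size, channels_in=1, channels_out=1):
    """Weights in one square convolution (bias ignored; an assumption)."""
    return kernel_size * kernel_size * channels_in * channels_out


def stacked_receptive_field(kernel_sizes):
    """Receptive field of stride-1 convolutions stacked on top of each other."""
    return 1 + sum(k - 1 for k in kernel_sizes)


single_5x5 = conv_params(5)        # 25 weights
two_3x3 = 2 * conv_params(3)       # 18 weights
ratio = two_3x3 / single_5x5       # 18 / 25

print(f"parameter ratio: {ratio:.0%}")
print("receptive field of two 3x3:", stacked_receptive_field([3, 3]))
```

The two stacked 3x3 layers cover the same 5x5 input window as the single 5x5 layer, which is why the replacement is possible at all.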

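Django exercise

Implement a blog website.

It is similar to the site you are currently browsing, with some additional features:
1. Log in and log out
2. Edit blog posts after logging in
3. Browse blog posts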

GitFlow

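1. Overview

2. On the current branch, master

git pull --rebase origin master

3. When starting task TDP-xxx for the first time, create a branch. The prefix before "/" can be fix, feature, or hotfix, for a bug fix, a new feature, or an urgent fix respectively

git checkout -b feature/yourname_TDP-xxx

4. If the task was already started, switch to its branch

git checkout feature/yourname_TDP-xxx

5. When the changes are done, commit and push them

git add *
git commit -m 'description of your changes'
git push origin feature/yourname_TDP-xxx

6. After local testing, move on to online testing: switch to the dev branch, update it to the latest, and merge

git checkout dev
git pull --rebase origin dev
git merge --no-ff feature/yourname_TDP-xxx

7. Once the merge succeeds

git push origin dev

8. Verify your changes on dev; if there are no problems, create a pull request on GitHub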

Some Python exercises

1. Add Digits

'''
Given a non-negative integer num, repeatedly add all its digits until the result has only one digit.

For example:

Given num = 38, the process is like: 3 + 8 = 11, 1 + 1 = 2. Since 2 has only one digit, return it.

Follow up:
Could you do it without any loop/recursion in O(1) runtime?
'''

class Solution(object):
    def addDigits(self, num):
        pass

My solution

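from functools import reduce

# Three alternative implementations; in one class the last definition wins.
class Solution(object):
    # O(1): the result is 9 when a positive num is divisible by 9, else num % 9
    def addDigits(self, num):
        if num == 0:
            return 0
        return num % 9 if num % 9 != 0 else 9

    # O(1): the same result as a single expression
    def addDigits(self, num):
        if num == 0:
            return 0
        return (num - 1) % 9 + 1

    # Recursion: sum the digits until a single digit remains
    def addDigits(self, num):
        return num if num < 10 else self.addDigits(reduce(lambda x, y: int(x) + int(y), list(str(num))))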

2. array partition i

'''
Given an array of 2n integers, your task is to group these integers
into n pairs of integer, say (a1, b1), (a2, b2), ..., (an, bn)
which makes sum of min(ai, bi) for all i from 1 to n as large as possible.

Example 1:

Input: [1,4,3,2]

Output: 4
Explanation: n is 2, and the maximum sum of pairs is 4.

Note:

n is a positive integer, which is in the range of [1, 10000].
All the integers in the array will be in the range of [-10000, 10000].

'''
class Solution(object):
    def arrayPairSum(self, nums):
        """
        :type nums: List[int]
        :rtype: int
        """
        pass

My solution

class Solution(object):
    def arrayPairSum(self, nums):
        """
        :type nums: List[int]
        :rtype: int
        """
        return sum(sorted(nums)[::2])

3. detect capital

'''
Given a word, you need to judge whether the usage of capitals in it is right or not.

We define the usage of capitals in a word to be right when one of the following cases holds:

All letters in this word are capitals, like "USA".
All letters in this word are not capitals, like "leetcode".
Only the first letter in this word is capital if it has more than one letter, like "Google".

Otherwise, we define that this word doesn't use capitals in a right way.

Example 1:

Input: "USA"
Output: True

Example 2:

Input: "FlaG"
Output: False


'''

class Solution(object):
    def detectCapitalUse(self, word):
        """
        :type word: str
        :rtype: bool
        """

My solution

class Solution(object):
    def detectCapitalUse(self, word):
        """
        :type word: str
        :rtype: bool
        """
        # All capitals, all lowercase, or only the first letter capitalized
        return word.isupper() or word.islower() or word.istitle()

4. distribute candies

"""
Given an integer array with even length, where different
numbers in this array represent different kinds of candies.
Each number means one candy of the corresponding kind.
You need to distribute these candies equally in number to
brother and sister. Return the maximum number of kinds of
candies the sister could gain.

Example 1:
Input: candies = [1,1,2,2,3,3]
Output: 3
Explanation:
There are three different kinds of candies (1, 2 and 3),
and two candies for each kind.
Optimal distribution: The sister has candies [1,2,3] and
the brother has candies [1,2,3], too.
The sister has three different kinds of candies.
Example 2:
Input: candies = [1,1,2,3]
Output: 2
Explanation: For example, the sister has candies [2,3] and
the brother has candies [1,1].
The sister has two different kinds of candies, the brother
has only one kind of candies.
Note:

The length of the given array is in range [2, 10,000], and
will be even.
The number in given array is in range [-100,000, 100,000].
"""


class Solution(object):
    def distributeCandies(self, candies):
        """
        :type candies: List[int]
        :rtype: int
        """

My solution

class Solution(object):
    def distributeCandies(self, candies):
        """
        :type candies: List[int]
        :rtype: int
        """
        return min(len(candies)//2, len(set(candies)))

5.two-sum-ii-input-array-is-sorted

'''
Given an array of integers that is already sorted in ascending order, find two numbers such that they add up to a specific target number.

The function twoSum should return indices of the two numbers such that they add up to the target, where index1 must be less than index2. Please note that your returned answers (both index1 and index2) are not zero-based.

You may assume that each input would have exactly one solution and you may not use the same element twice.

Input: numbers={2, 7, 11, 15}, target=9
Output: index1=1, index2=2
'''
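
One possible approach to this exercise, offered as a sketch rather than as the author's solution: because the array is sorted, two pointers moving inward from both ends find the pair in linear time. The function name `two_sum_sorted` is hypothetical.

```python
def two_sum_sorted(numbers, target):
    """Return 1-based indices of the two numbers that add up to target."""
    left, right = 0, len(numbers) - 1
    while left < right:
        total = numbers[left] + numbers[right]
        if total == target:
            return [left + 1, right + 1]   # answers are not zero-based
        if total < target:
            left += 1                      # need a larger sum
        else:
            right -= 1                     # need a smaller sum
    return []                              # no pair found


print(two_sum_sorted([2, 7, 11, 15], 9))
```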

class Solution(object):

    def twoSum(self, numbers, target):
        """
        :type numbers: List[int]
        :type target: int
        :rtype: List[int]
        """

6.single-number

'''
Given an array of integers, every element appears twice except for one. Find that single one.

Note:
Your algorithm should have a linear runtime complexity. Could you implement it without using extra memory?
'''
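
One possible approach, again only a sketch and not the author's solution: XOR-ing all elements cancels every value that appears twice, which gives linear time without extra memory. The function name `single_number` is hypothetical.

```python
from functools import reduce
from operator import xor


def single_number(nums):
    """x ^ x == 0 and x ^ 0 == x, so paired values cancel out."""
    return reduce(xor, nums, 0)


print(single_number([4, 1, 2, 1, 2]))
```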
class Solution(object):
    def singleNumber(self, nums):
        """
        :type nums: List[int]
        :rtype: int
        """

7.reverse-words-in-a-string-iii

"""
Given a string, you need to reverse the order
of characters in each word within a sentence while
still preserving whitespace and initial word order.

Example 1:
Input: "Let's take LeetCode contest"
Output: "s'teL ekat edoCteeL tsetnoc"
Note: In the string, each word is separated by
single space and there will not be any extra space in the string.
"""


class Solution(object):
    def reverseWords(self, s):
        """
        :type s: str
        :rtype: str
        """

8.sum-of-left-leaves

'''
Find the sum of all left leaves in a given binary tree.

Example:

    3
   / \
  9  20
    /  \
   15   7

There are two left leaves in the binary tree, with values 9 and 15 respectively. Return 24.
'''

# Definition for a binary tree node.
# class TreeNode(object):
#     def __init__(self, x, left=None, right=None):
#         self.val = x
#         self.left = left
#         self.right = right

class Solution(object):
    def sumOfLeftLeaves(self, root):
        """
        :type root: TreeNode
        :rtype: int
        """

9.relative-ranks

'''
Given scores of N athletes, find their relative ranks and the people with the top three highest scores, who will be awarded medals: "Gold Medal", "Silver Medal" and "Bronze Medal".

Example 1:

Input: [5, 4, 3, 2, 1]
Output: ["Gold Medal", "Silver Medal", "Bronze Medal", "4", "5"]
Explanation: The first three athletes got the top three highest scores, so they got "Gold Medal", "Silver Medal" and "Bronze Medal".
For the left two athletes, you just need to output their relative ranks according to their scores.

Note:

N is a positive integer and won't exceed 10,000.
All the scores of athletes are guaranteed to be unique.

'''


class Solution(object):
    def findRelativeRanks(self, nums):
        """
        :type nums: List[int]
        :rtype: List[str]
        """

10.ransom-note

'''
Given an arbitrary ransom note string and another string containing letters
from all the magazines, write a function that will return true if the ransom
note can be constructed from the magazines ; otherwise, it will return false.

Each letter in the magazine string can only be used once in your ransom note.

Note:
You may assume that both strings contain only lowercase letters.

canConstruct("a", "b") -> false
canConstruct("aa", "ab") -> false
canConstruct("aa", "aab") -> true

'''
class Solution(object):
    def canConstruct(self, ransomNote, magazine):
        """
        :type ransomNote: str
        :type magazine: str
        :rtype: bool
        """

11.matrix

'''
Given a matrix consists of 0 and 1, find the distance of the nearest 0 for each cell.

The distance between two adjacent cells is 1.
Example 1:
Input:

0 0 0
0 1 0
0 0 0
Output:
0 0 0
0 1 0
0 0 0
Example 2:
Input:

0 0 0
0 1 0
1 1 1
Output:
0 0 0
0 1 0
1 2 1
Note:
The number of elements of the given matrix will not exceed 10,000.
There are at least one 0 in the given matrix.
The cells are adjacent in only four directions: up, down, left and right.

'''

class Solution(object):
    def updateMatrix(self, matrix):
        """
        :type matrix: List[List[int]]
        :rtype: List[List[int]]
        """

12.1-bit-and-2-bit-characters

'''
We have two special characters. The first character can be represented by one bit 0. The second character can be represented by two bits (10 or 11).

Now given a string represented by several bits. Return whether the last character must be a one-bit character or not. The given string will always end with a zero.

Example 1:
Input:
bits = [1, 0, 0]
Output: True
Explanation:
The only way to decode it is two-bit character and one-bit character. So the last character is one-bit character.
Example 2:
Input:
bits = [1, 1, 1, 0]
Output: False
Explanation:
The only way to decode it is two-bit character and two-bit character. So the last character is NOT one-bit character.
Note:

1 <= len(bits) <= 1000.
bits[i] is always 0 or 1.
'''


class Solution(object):
    def isOneBitCharacter(self, bits):
        """
        :type bits: List[int]
        :rtype: bool
        """

13.132 Pattern

Given a sequence of n integers a1, a2, ..., an, a 132 pattern is a subsequence ai, aj, ak such that i < j < k and ai < ak < aj. Design an algorithm that takes a list of n numbers as input and checks whether there is a 132 pattern in the list.

Note: n will be less than 15,000.

Example 1:
Input: [1, 2, 3, 4]

Output: False

Explanation: There is no 132 pattern in the sequence.
Example 2:
Input: [3, 1, 4, 2]

Output: True

Explanation: There is a 132 pattern in the sequence: [1, 4, 2].
Example 3:
Input: [-1, 3, 2, 0]

Output: True

Explanation: There are three 132 patterns in the sequence: [-1, 3, 2], [-1, 3, 0] and [-1, 2, 0].

class Solution(object):
    def find132pattern(self, nums):
        """
        :type nums: List[int]
        :rtype: bool
        """

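Variance, covariance, covariance matrix, centering, whitening, dewhitening

References: http://blog.csdn.net/kuang_liu/article/details/16369475 http://blog.csdn.net/beechina/article/details/51074750

Mean

Describes the center of the samples.

Variance

Describes how concentrated the samples are: the average of the squared distances from each sample point to the mean.

Covariance

The correlation between the parameters of different dimensions. Positive: positively correlated; negative: negatively correlated; 0: uncorrelated.

Covariance matrix

The covariances across multiple dimensions.

How to compute it:
* 1. Center each dimension, i.e. subtract the mean of each dimension so that every dimension has mean 0, giving the matrix X
* 2. Cov = X * X.T / (m - 1), where m is the number of samples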
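
The two steps can be sketched in plain Python. This is a minimal illustration, assuming the data is laid out with one row per dimension and one column per sample (so that X * X.T has one entry per pair of dimensions); the function name and the sample data are made up.

```python
def covariance_matrix(data):
    """data: one row per dimension, each row holding m samples."""
    m = len(data[0])
    # Step 1: center every dimension so that its mean becomes 0.
    centered = []
    for row in data:
        mean = sum(row) / m
        centered.append([value - mean for value in row])
    # Step 2: Cov = X * X.T / (m - 1)
    return [
        [sum(a * b for a, b in zip(row_i, row_j)) / (m - 1) for row_j in centered]
        for row_i in centered
    ]


data = [
    [1.0, 2.0, 3.0, 4.0],   # dimension 1
    [2.0, 4.0, 6.0, 8.0],   # dimension 2 (exactly twice dimension 1)
]
cov = covariance_matrix(data)
for row in cov:
    print(row)
```

The diagonal holds the variance of each dimension, and the off-diagonal entries are positive here because the two dimensions rise together.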

restricted Boltzmann machine, RBM

Partition function, conditional independence

Contrastive divergence (CD) algorithm

Deep belief network

CIFAR-10

1.about CIFAR-10

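A dataset for training classifiers, with 10 classes in total. URL: http://www.cs.toronto.edu/~kriz/cifar.html

2. Model

CIFAR-10

As shown in the figure, the architecture currently in use is
(C + MP) * 2 + F*3
> (C: convolutional layer, MP: max pooling, F: fully connected layer)

3. Results

  • 2017-09-17 Accuracy: 62%, training time 4 h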
  • To be continued

Q-Learning

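1. Main idea

  • 1. Create a Q table holding every environment state (S), action (A), and reward (R), so that the action with the largest reward can later be looked up for a given state
  • 2. Initialize the environment
  • 3. Get the environment's state, observation (s)
  • 4. Look up the Q table to choose an action (a)
  • 5. Apply the action to the environment to get the new state observation_ (s_) and the reward (r). This gives Q_target = reward + gamma * Q_max(observation_, action_), where Q_max(observation_, action_) is the largest value in the Q table for the next step
  • 6. Look up the Q table with the action to get Q_estimate = Q(s, a)
  • 7. error = Q_target - Q_estimate
  • 8. Keep learning, with gradient descent or another update rule, to reduce the error: Q(s, a) = Q(s, a) + alpha * error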
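
Steps 5 to 8 can be sketched as a single table update. This is a minimal illustration under stated assumptions: the Q table is a `defaultdict` keyed by (state, action), the function name, the `done` flag for terminal states, and the default values of alpha and gamma are all hypothetical choices not given in the notes.

```python
from collections import defaultdict


def q_learning_update(q_table, s, a, r, s_next, actions,
                      alpha=0.1, gamma=0.9, done=False):
    """Apply one tabular Q-learning update and return the error."""
    q_estimate = q_table[(s, a)]                                   # step 6
    # Step 5: largest Q value available from the next state
    # (treated as 0 for a terminal state; an assumption).
    q_max_next = 0.0 if done else max(q_table[(s_next, a2)] for a2 in actions)
    q_target = r + gamma * q_max_next
    error = q_target - q_estimate                                  # step 7
    q_table[(s, a)] = q_estimate + alpha * error                   # step 8
    return error


q = defaultdict(float)            # unseen (state, action) pairs start at 0
actions = ["left", "right"]
q[("s1", "right")] = 1.0          # pretend this value was learned earlier

err = q_learning_update(q, "s0", "right", 0.0, "s1", actions)
print(err, q[("s0", "right")])
```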

2. Algorithm

Q-Learning

Sarsa

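1. Main idea

* Similar to Q-Learning, but Q_target in step 5 is obtained differently.
  Q-learning looks up the Q table for the largest next-step value.
  Sarsa feeds the new state into the Q table and the policy to pick the next action action_, and uses the value Q(observation_, action_) of that action.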
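
The difference comes down to how the target value is computed. The sketch below contrasts the two; the function names, the dictionary-based Q table, and the gamma value are illustrative assumptions.

```python
def q_learning_target(q_table, reward, next_state, actions, gamma=0.9):
    """Off-policy: use the best next action, whether or not it is taken."""
    best_next = max(q_table.get((next_state, a), 0.0) for a in actions)
    return reward + gamma * best_next


def sarsa_target(q_table, reward, next_state, next_action, gamma=0.9):
    """On-policy: use the action that will actually be taken next."""
    return reward + gamma * q_table.get((next_state, next_action), 0.0)


q_table = {("s1", "left"): 0.2, ("s1", "right"): 1.0}
actions = ["left", "right"]

target_q = q_learning_target(q_table, 0.0, "s1", actions)
target_sarsa = sarsa_target(q_table, 0.0, "s1", "left")   # policy picked "left"
print(target_q, target_sarsa)
```

When the policy happens to pick the greedy action, both targets coincide; they differ only when the chosen next action is not the best one, for example during exploration.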

2. Algorithm

Sarsa

Sarsa(lambda)

1. Algorithm

Sarsa

DQN

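1. Main idea

  • 1. Similar to Q-learning, but Q-learning is not suitable when the state is continuous or the number of states is unbounded, so a neural network replaces the Q table. This is like replacing the table with a complex function: given the environment state s and the action a, it outputs the corresponding reward
  • 2. Use two neural networks: one (N_eval) for Q_target and one (N_target) for Q_estimate
  • 3. N_target's parameters are newer than N_eval's. N_eval is used to compute Q_target, while N_target is trained continuously against N_eval's values. After a set number of steps, N_eval's parameters are replaced with N_target's

2. Algorithm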

Sarsa

0. design model

Design the model: lay_in, lay_hidden, lay_out

1. randomly initialize theta

Initialize each theta(l) to a random value in [-e, e]

2. forward propagation

2.1. compute a(l)

Compute a(l) on layer l

2.2. compute cost J(theta)

compute cost J(theta)

3. back propagation

3.1. compute delta

compute delta

3.2. compute Delta

compute Delta

3.3. compute partial derivatives

compute partial derivatives
compute partial derivatives

4. add regularization

5. gradient checking

Use a small-scale model to verify that the code is correct
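
Gradient checking compares the analytic gradient with a numerical estimate from central differences. The sketch below uses a toy cost J(theta) = sum(theta^2) rather than a real network cost; the function names, the toy cost, and the epsilon value are assumptions chosen for illustration.

```python
def numerical_gradient(cost, theta, epsilon=1e-4):
    """Estimate dJ/dtheta_i as (J(theta + e_i) - J(theta - e_i)) / (2 * epsilon)."""
    grad = []
    for i in range(len(theta)):
        plus = list(theta)
        minus = list(theta)
        plus[i] += epsilon
        minus[i] -= epsilon
        grad.append((cost(plus) - cost(minus)) / (2 * epsilon))
    return grad


def cost(theta):
    """Toy cost standing in for the network cost J(theta)."""
    return sum(t ** 2 for t in theta)


def analytic_gradient(theta):
    """Gradient of the toy cost; in practice this comes from back propagation."""
    return [2 * t for t in theta]


theta = [0.5, -1.0, 2.0]
num_grad = numerical_gradient(cost, theta)
ana_grad = analytic_gradient(theta)
max_diff = max(abs(n - a) for n, a in zip(num_grad, ana_grad))
print("largest difference:", max_diff)
```

A tiny difference suggests the analytic gradient is implemented correctly. Because the numerical estimate evaluates the cost twice per parameter, it is only practical on a small model and is switched off for real training.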

6. use gradient descent or advanced optimization method try to minimize J(theta)

Use gradient descent or an advanced optimizer (e.g. fmincg) to iteratively minimize the cost J(theta)

options = optimset('MaxIter', 50);
costFunction = @(p) nnCostFunction(p, ...
input_layer_size, ...
hidden_layer_size, ...
num_labels, X, y, lambda);
[nn_params, cost] = fmincg(costFunction, initial_nn_params, options);

P.S.

How to write formulas
<img src="http://chart.googleapis.com/chart?cht=tx&chl= insert LaTeX formula here" style="border:none;">
For example:
<img src="http://chart.googleapis.com/chart?cht=tx&chl=\Large x=\frac{-b\pm\sqrt{b^2-4ac}}{2a}" style="border:none;">

Result
