CN104376098A - File batch validation method based on python - Google Patents

File batch validation method based on python

Info

Publication number
CN104376098A
Authority
CN
China
Prior art keywords
file
function
python
files
create
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201410684364.6A
Other languages
Chinese (zh)
Other versions
CN104376098B (en)
Inventor
孙志云
宗栋瑞
吴楠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inspur Cloud Information Technology Co Ltd
Original Assignee
Inspur Electronic Information Industry Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inspur Electronic Information Industry Co Ltd filed Critical Inspur Electronic Information Industry Co Ltd
Priority to CN201410684364.6A priority Critical patent/CN104376098B/en
Publication of CN104376098A publication Critical patent/CN104376098A/en
Application granted granted Critical
Publication of CN104376098B publication Critical patent/CN104376098B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08 Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/11 File system administration, e.g. details of archiving or snapshots

Abstract

The invention discloses a file batch validation method based on Python. The method includes the following steps: (1) an environment with Python 2.6 or later is set up; (2) a Python file is written, and the Python functions md5sum, find_files, generate_md5, read_md5 and check_md5 are created; (3) the md5list file is generated by calling the generate_md5 function; (4) batch validation of files is performed by calling the check_md5 function. The method completes batch validation of files in a simple and flexible way, so that a user can validate files in batches quickly and easily, and efficiency is improved.

Description

A file batch validation method based on Python
Technical field
The present invention relates to a file batch validation method, specifically a file batch validation method based on Python.
Background
Files often run into problems during transmission: a file may be corrupted, intercepted and maliciously modified or replaced, or have its content stolen. All of these situations seriously affect the normal use and security of the file.
MD5 stands for Message-Digest Algorithm 5. It was developed in the early 1990s by the MIT Laboratory for Computer Science and RSA Data Security Inc., and evolved from MD2, MD3 and MD4. MD5 transforms a byte string of arbitrary length into a 128-bit integer, and the mapping is irreversible. A typical application of MD5 is to produce a fingerprint of a message (a byte string) so that tampering can be detected.
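As a minimal illustration of the fingerprint idea, the following Python sketch uses the standard hashlib module. The two sample messages are arbitrary and are not taken from the patent.

```python
import hashlib

# Two sample messages (hypothetical) that differ in a single character.
original = b"hello world"
tampered = b"hello World"

fingerprint = hashlib.md5(original).hexdigest()

# The digest is always 128 bits: 16 bytes, or 32 hexadecimal digits.
print(hashlib.md5(original).digest_size * 8)  # 128
print(len(fingerprint))                       # 32

# Any change to the message yields a different fingerprint.
print(fingerprint == hashlib.md5(tampered).hexdigest())  # False
```

Comparing a stored fingerprint with a freshly computed one is the check that the rest of the method applies to whole files.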
Python is an object-oriented, interpreted programming language. It was created by Guido van Rossum at the end of 1989, and the first public version was released in 1991. Python's syntax is concise and clear, and it has rich and powerful class libraries.
Summary of the invention
The technical task of the present invention is to provide a file batch validation method based on Python that completes batch validation of files in a simple and flexible way, so that a user can validate files in batches quickly and easily and efficiency is improved.
The technical task of the present invention is achieved in the following manner:
A file batch validation method based on Python comprises the following steps:
(1) set up an environment with Python 2.6 or later;
(2) write a Python file and create the Python functions md5sum, find_files, generate_md5, read_md5 and check_md5;
(3) generate the md5list file by calling the generate_md5 function;
(4) perform batch validation of the files by calling the check_md5 function.
In step (2), the md5sum function reads the file iteratively in binary read mode, 8 KB per read.
In step (3), the md5list file contains each file path and its corresponding MD5 value.
In step (4), the check_md5 function uses the md5list file to check whether each file under the current path exists and whether it matches its recorded MD5 value; check_md5 outputs the names of the files that do not match and returns them as a list.
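The block-wise reading described for step (2) can be sketched as follows. This is an illustrative Python 3 version, not the patent's own listing: the name md5sum_blocks and the temporary test file are assumptions made for the demonstration. It shows that feeding a file to MD5 in 8 KB blocks gives the same digest as hashing the whole content at once, while never holding more than one block in memory.

```python
import hashlib
import os
import tempfile

BLOCK_SIZE = 8192  # 8 KB per read, as described for md5sum

def md5sum_blocks(path, block_size=BLOCK_SIZE):
    """Hash a file by feeding it to MD5 one block at a time."""
    m = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read until it returns b"".
        for block in iter(lambda: f.read(block_size), b""):
            m.update(block)
    return m.hexdigest()

# Demonstration on a temporary file that spans several blocks.
data = os.urandom(3 * BLOCK_SIZE + 123)
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(data)
try:
    chunked = md5sum_blocks(tmp.name)
    whole = hashlib.md5(data).hexdigest()
    print(chunked == whole)  # True: block-wise updates give the same digest
finally:
    os.remove(tmp.name)
```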
A file batch validation method based on Python, with the following detailed steps:
1. set up an environment with Python 2.6 or later;
2. enter the directory of the files to be validated;
3. create a Python function md5sum, which works as follows: create an instance m of hashlib.md5; create a function read_block that opens the specified file in binary read mode and, starting from the beginning of the file, reads 8 KB of content at a time, and use the update method of m to update the MD5 value; return the 32-digit hexadecimal digest of m;
4. create a Python function find_files, which works as follows: obtain the absolute paths of all files under the specified path;
5. create a Python function generate_md5, which works as follows: call the functions of step 3 and step 4 in a loop to generate, in the current directory, a file named md5list that contains each file path and its corresponding MD5 value;
6. create a Python function read_md5, which works as follows: read the content of the md5list file, save it as a dictionary in the form {filename: md5} and return it;
7. create a Python function check_md5, which works as follows: call the read_md5 function of step 6 to obtain the original MD5 values of the files; obtain the absolute paths of all files under the specified path; call the md5sum function of step 3 in a loop to obtain the MD5 values of all files and compare them with the original MD5 values; output the names of the files whose values differ and return them.
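The md5list format used in steps 5 and 6 (one file path and its MD5 value per line) can be sketched as a round trip between text and a {filename: md5} dictionary. The helper names format_md5list and parse_md5list, the sample file names and the sample digests are all hypothetical. Splitting on the last space is a design choice made here so that paths containing spaces survive.

```python
def format_md5list(files_md5):
    """Render {path: md5} as one 'path md5' line per file."""
    return "".join(
        "%s %s\n" % (path, digest) for path, digest in sorted(files_md5.items())
    )

def parse_md5list(text):
    """Parse 'path md5' lines back into a {path: md5} dictionary."""
    files_md5 = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        # Split on the last space so that paths containing spaces stay intact.
        path, digest = line.rsplit(" ", 1)
        files_md5[path] = digest
    return files_md5

# Hypothetical entries; the digests are sample values only.
sample = {
    "/home/test/a.txt": "d41d8cd98f00b204e9800998ecf8427e",
    "/home/test/my notes.txt": "5eb63bbbe01eeed093cb22bb8f5acdc3",
}
text = format_md5list(sample)
print(parse_md5list(text) == sample)  # True
```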
The file batch validation method based on Python of the present invention has the following advantages: the implementation is relatively simple and the process is clear; validation efficiency is high; validating large files does not cause the program to hang; and the code structure is clear, which makes it easy for developers to develop, debug and maintain.
Brief description of the drawings
The present invention is further described below with reference to the accompanying drawing.
Figure 1 is a flow chart of the file batch validation method based on Python.
Detailed description of the embodiments
The file batch validation method based on Python of the present invention is described in detail below with reference to the accompanying drawing and specific embodiments.
Embodiment 1:
The file batch validation method based on Python of the present invention comprises the following steps:
(1) set up an environment with Python 2.6 or later;
(2) write a Python file and create the Python functions md5sum, find_files, generate_md5, read_md5 and check_md5;
(3) generate the md5list file by calling the generate_md5 function;
(4) perform batch validation of the files by calling the check_md5 function.
In step (2), the md5sum function reads the file iteratively in binary read mode, 8 KB per read.
In step (3), the md5list file contains each file path and its corresponding MD5 value.
In step (4), the check_md5 function uses the md5list file to check whether each file under the current path exists and whether it matches its recorded MD5 value; check_md5 outputs the names of the files that do not match and returns them as a list.
Embodiment 2:
The file batch validation method based on Python of the present invention, with the following detailed steps:
1. set up an environment with Python 2.6 or later;
2. enter the directory of the files to be validated;
3. create a Python function md5sum, which works as follows: create an instance m of hashlib.md5; create a function read_block that opens the specified file in binary read mode and, starting from the beginning of the file, reads 8 KB of content at a time, and use the update method of m to update the MD5 value; return the 32-digit hexadecimal digest of m;
4. create a Python function find_files, which works as follows: obtain the absolute paths of all files under the specified path;
5. create a Python function generate_md5, which works as follows: call the functions of step 3 and step 4 in a loop to generate, in the current directory, a file named md5list that contains each file path and its corresponding MD5 value;
6. create a Python function read_md5, which works as follows: read the content of the md5list file, save it as a dictionary in the form {filename: md5} and return it;
7. create a Python function check_md5, which works as follows: call the read_md5 function of step 6 to obtain the original MD5 values of the files; obtain the absolute paths of all files under the specified path; call the md5sum function of step 3 in a loop to obtain the MD5 values of all files and compare them with the original MD5 values; output the names of the files whose values differ and return them.
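The comparison at the heart of step 7 can be isolated from file access and sketched on plain dictionaries. The function name compare_md5 and the sample data are hypothetical; the short strings stand in for real 32-digit digests.

```python
def compare_md5(original, current):
    """Return the names that are missing or whose digest changed.

    original: {filename: md5} recorded earlier
    current:  {filename: md5} computed now (missing files are simply absent)
    """
    not_matched = []
    for name, digest in original.items():
        # current.get returns None for a missing file, which never equals a digest.
        if current.get(name) != digest:
            not_matched.append(name)
    return not_matched

original = {"a.txt": "aaa", "b.txt": "bbb", "c.txt": "ccc"}
current = {"a.txt": "aaa", "b.txt": "XXX"}  # b.txt changed, c.txt is missing
result = sorted(compare_md5(original, current))
print(result)  # ['b.txt', 'c.txt']
```

Treating a missing file and a changed file the same way mirrors the description, where both cases end up in the list of unmatched file names.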
Embodiment 3:
The file batch validation method based on Python of the present invention, in the following environment:
Hardware: x86 architecture PC;
Software: CentOS 6.3_x86_64.
The specific implementation steps are as follows:
1. download python-2.7.8 and install it with the default settings;
2. enter the /home/test folder and create a Python file named file_md5_check.py with the following content:
import hashlib
import os
import time

# The checksum list is kept in the current working directory.
md5list = os.getcwd() + os.sep + "md5list"

def md5sum(file):
    # Return the 32-digit hexadecimal MD5 digest of a file.
    m = hashlib.md5()

    def read_block():
        # Yield the file in 8 KB blocks so a large file is never loaded whole.
        with open(file, "rb") as f:
            f.seek(0)
            while True:
                block = f.read(8192)
                if block:
                    yield block
                else:
                    return

    for b in read_block():
        m.update(b)
    return m.hexdigest()

def find_files(path):
    # Yield the path of every file under path, skipping md5list and this script.
    for path, dirs, files in os.walk(path):
        for file in files:
            if file != "md5list" and file != os.path.basename(__file__):
                file_path = os.path.join(path, file)
                yield file_path

def generate_md5(path):
    # Keep any previous md5list by renaming it with a timestamp suffix.
    if os.path.isfile(md5list):
        os.rename(md5list, md5list + str(time.time()))
    with open(md5list, "w") as f:
        for file in find_files(path):
            f.write(file + " " + md5sum(file) + "\n")

def read_md5():
    # Read md5list into a {file path: md5} dictionary.
    files_md5 = {}
    with open(md5list, "r") as f:
        for line in f:
            if line.strip():
                # Split on the last space so paths containing spaces stay intact.
                path, md5 = line.rstrip("\n").rsplit(" ", 1)
                files_md5[path] = md5
    return files_md5

def check_md5():
    # Collect every listed file that is missing or whose MD5 value has changed.
    ori_files_md5_dic = read_md5()
    not_matched_file = []
    for file in ori_files_md5_dic.keys():
        if not os.path.isfile(file):
            not_matched_file.append(file)
        elif ori_files_md5_dic[file] != md5sum(file):
            not_matched_file.append(file)
    if not_matched_file:
        for i in not_matched_file:
            print(i + " is not matched\n")
    else:
        print("all the files are matched")
    return not_matched_file
3. call the generate_md5 method with the parameter "/home/test" to generate the md5list file under the test folder, containing the MD5 information of all files under /home/test;
4. call the check_md5 method to check whether the files under the current folder match md5list, and print the names of the files that do not match.
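The generate-then-check workflow of steps 3 and 4 can be tried end to end with the following self-contained sketch. It is an illustrative Python 3 variant rather than the patent's listing: here the list path is passed as a parameter instead of being derived from the current directory, a temporary directory stands in for /home/test, and the sample file names and contents are invented for the demonstration.

```python
import hashlib
import os
import shutil
import tempfile

def md5sum(path, block_size=8192):
    """Return the hexadecimal MD5 digest of a file, read in 8 KB blocks."""
    m = hashlib.md5()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(block_size), b""):
            m.update(block)
    return m.hexdigest()

def generate_md5(root, list_path):
    """Write a 'path md5' line for every file under root except the list itself."""
    with open(list_path, "w") as out:
        for dirpath, _dirs, files in os.walk(root):
            for name in files:
                full = os.path.join(dirpath, name)
                if os.path.abspath(full) != os.path.abspath(list_path):
                    out.write("%s %s\n" % (full, md5sum(full)))

def check_md5(list_path):
    """Return the listed files that are missing or no longer match."""
    not_matched = []
    with open(list_path) as f:
        for line in f:
            if not line.strip():
                continue
            path, digest = line.rstrip("\n").rsplit(" ", 1)
            if not os.path.isfile(path) or md5sum(path) != digest:
                not_matched.append(path)
    return not_matched

# Demonstration in a throwaway directory (stands in for /home/test).
root = tempfile.mkdtemp()
list_path = os.path.join(root, "md5list")
for name, content in [("a.txt", b"alpha"), ("b.txt", b"beta"), ("c.txt", b"gamma")]:
    with open(os.path.join(root, name), "wb") as f:
        f.write(content)

generate_md5(root, list_path)
before = check_md5(list_path)            # nothing has changed yet

with open(os.path.join(root, "b.txt"), "wb") as f:
    f.write(b"tampered")                 # simulate a modified file
os.remove(os.path.join(root, "c.txt"))   # simulate a missing file

after = sorted(os.path.basename(p) for p in check_md5(list_path))
print(before)  # []
print(after)   # ['b.txt', 'c.txt']

shutil.rmtree(root)  # clean up the demonstration directory
```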
With the embodiments above, those skilled in the art can readily implement the present invention. It should be understood, however, that the present invention is not limited to the three embodiments above. On the basis of the disclosed embodiments, those skilled in the art may combine different technical features in any way to obtain different technical solutions.

Claims (5)

1. A file batch validation method based on Python, characterized by comprising the following steps:
(1) set up an environment with Python 2.6 or later;
(2) write a Python file and create the Python functions md5sum, find_files, generate_md5, read_md5 and check_md5;
(3) generate the md5list file by calling the generate_md5 function;
(4) perform batch validation of the files by calling the check_md5 function.
2. The file batch validation method based on Python according to claim 1, characterized in that, in step (2), the md5sum function reads the file iteratively in binary read mode, 8 KB per read.
3. The file batch validation method based on Python according to claim 1, characterized in that, in step (3), the md5list file contains each file path and its corresponding MD5 value.
4. The file batch validation method based on Python according to claim 1, characterized in that, in step (4), the check_md5 function uses the md5list file to check whether each file under the current path exists and whether it matches its recorded MD5 value; check_md5 outputs the names of the files that do not match and returns them as a list.
5. A file batch validation method based on Python, characterized in that the detailed steps are as follows:
1. set up an environment with Python 2.6 or later;
2. enter the directory of the files to be validated;
3. create a Python function md5sum, which works as follows: create an instance m of hashlib.md5; create a function read_block that opens the specified file in binary read mode and, starting from the beginning of the file, reads 8 KB of content at a time, and use the update method of m to update the MD5 value; return the 32-digit hexadecimal digest of m;
4. create a Python function find_files, which works as follows: obtain the absolute paths of all files under the specified path;
5. create a Python function generate_md5, which works as follows: call the functions of step 3 and step 4 in a loop to generate, in the current directory, a file named md5list that contains each file path and its corresponding MD5 value;
6. create a Python function read_md5, which works as follows: read the content of the md5list file, save it as a dictionary in the form {filename: md5} and return it;
7. create a Python function check_md5, which works as follows: call the read_md5 function of step 6 to obtain the original MD5 values of the files; obtain the absolute paths of all files under the specified path; call the md5sum function of step 3 in a loop to obtain the MD5 values of all files and compare them with the original MD5 values; output the names of the files whose values differ and return them.
CN201410684364.6A 2014-11-25 2014-11-25 File batch validation method based on python Active CN104376098B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410684364.6A CN104376098B (en) 2014-11-25 2014-11-25 File batch validation method based on python

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410684364.6A CN104376098B (en) 2014-11-25 2014-11-25 File batch validation method based on python

Publications (2)

Publication Number Publication Date
CN104376098A true CN104376098A (en) 2015-02-25
CN104376098B CN104376098B (en) 2017-06-30

Family

ID=52555005

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410684364.6A Active CN104376098B (en) 2014-11-25 2014-11-25 File batch validation method based on python

Country Status (1)

Country Link
CN (1) CN104376098B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106095462A (en) * 2016-06-22 2016-11-09 南京南瑞继保电气有限公司 A kind of embedded distribution system program configuration version management method
CN109254949A (en) * 2018-07-18 2019-01-22 北京深度智耀科技有限公司 A kind of method and device of document process
CN109471617A (en) * 2018-11-02 2019-03-15 郑州云海信息技术有限公司 A kind of enterprise's working hour submission method based on Python
CN114723419A (en) * 2022-04-29 2022-07-08 中汽研汽车检验中心(武汉)有限公司 Python-based method for automatically generating load layout curve documents in batches

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100287196A1 (en) * 2007-12-21 2010-11-11 Thomas Clay Shields Automated forensic document signatures
CN103577319A (en) * 2012-08-07 2014-02-12 腾讯科技(深圳)有限公司 Source code file detection method, source code file detection device and file release system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
马文杰: ""基于CAP理论的海量数据存储研究与应用"", 《中国优秀硕士学位论文全文数据库 信息科技辑》 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106095462A (en) * 2016-06-22 2016-11-09 南京南瑞继保电气有限公司 A kind of embedded distribution system program configuration version management method
CN109254949A (en) * 2018-07-18 2019-01-22 北京深度智耀科技有限公司 A kind of method and device of document process
CN109471617A (en) * 2018-11-02 2019-03-15 郑州云海信息技术有限公司 A kind of enterprise's working hour submission method based on Python
CN114723419A (en) * 2022-04-29 2022-07-08 中汽研汽车检验中心(武汉)有限公司 Python-based method for automatically generating load layout curve documents in batches
CN114723419B (en) * 2022-04-29 2023-04-07 中汽研汽车检验中心(武汉)有限公司 Python-based method for automatically generating load layout curve documents in batches
WO2023208251A1 (en) * 2022-04-29 2023-11-02 中汽研汽车检验中心(武汉)有限公司 Python-based method for automatically generating load distribution curve documents in batches

Also Published As

Publication number Publication date
CN104376098B (en) 2017-06-30

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20180814

Address after: 250101 S06 tower, 1036, Chao Lu Road, hi tech Zone, Ji'nan, Shandong.

Patentee after: Shandong wave cloud Mdt InfoTech Ltd

Address before: No. 1036, Shun Ya Road, Ji'nan high tech Zone, Shandong Province

Patentee before: Langchao Electronic Information Industry Co., Ltd.

CP03 Change of name, title or address

Address after: 250100 No. 1036 Tidal Road, Jinan High-tech Zone, Shandong Province, S01 Building, Tidal Science Park

Patentee after: Inspur cloud Information Technology Co., Ltd

Address before: 250101 S06 tower, 1036, Chao Lu Road, hi tech Zone, Ji'nan, Shandong.

Patentee before: SHANDONG LANGCHAO YUNTOU INFORMATION TECHNOLOGY Co.,Ltd.