Wednesday, April 29, 2020

File Handling

File handling is all about reading, writing, and updating files.

Writing data into simple text file

fo=open("helloworld.txt",mode='w')
fo.write('1234567890')
fo.close()

The helloworld.txt file was not there initially, but after executing the code the file is created and the data is written to it.

Reading Data from helloworld.txt file.

When the file is not present.

fo=open("helloworld.txt",mode='r')
str = fo.read()
print(str)
fo.close()

>>>
== RESTART: E:\Software\Google Drive\PRWATECH\BATCH-A\LearningFileHandling.py ==
Traceback (most recent call last):
  File "E:\Software\Google Drive\PRWATECH\BATCH-A\LearningFileHandling.py", line 1, in <module>
    fo=open("helloworld.txt",mode='r')
FileNotFoundError: [Errno 2] No such file or directory: 'helloworld.txt'
>>> 
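
If a missing file is an expected situation, the error can be caught instead of ending the script. A minimal sketch, assuming a hypothetical helper named read_text_or_default and a file name chosen only for illustration:

```python
import os
import tempfile


def read_text_or_default(path, default=""):
    """Return the text of the file, or `default` when the file is absent."""
    try:
        with open(path, mode="r") as fo:
            return fo.read()
    except FileNotFoundError:
        return default


# A path inside a fresh temporary directory, so the file is certainly absent.
missing_path = os.path.join(tempfile.mkdtemp(), "no_such_file.txt")
result = read_text_or_default(missing_path, default="<no file>")
print(result)
```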


When the file is present.

fo=open("helloworld.txt",mode='r')
str = fo.read()
print(str)
fo.close()

>>>
== RESTART: E:\Software\Google Drive\PRWATECH\BATCH-A\LearningFileHandling.py ==
1234567890
>>> 


Reading a specific number of characters.

fo=open("helloworld.txt",mode='r')
str = fo.read(5)#This will read 5 characters of the text file (5 bytes for this ASCII file)
print(str)
fo.close()


Check the current position of the pointer.

fo=open("helloworld.txt",mode='r')
str = fo.read(5)#This will read 5 bytes of the text file
print(str)
pos =  fo.tell()#After reading the 5 bytes of data position of pointer is 5
print(pos)
fo.close()


Reposition pointer at the beginning once again

fo=open("helloworld.txt",mode='r')
str = fo.read(5)#This will read 5 bytes of the text file
print(str)
pos =  fo.tell()#After reading the 5 bytes of data position of pointer is 5
print(pos)
pos=fo.seek(0,0)
print(pos)
fo.close()

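Syntax: f.seek(offset, from_what), where f is the file object
Parameters:
offset: Number of positions to move, counted from the reference point
from_what: Defines the point of reference (the Python documentation names this argument whence).
Returns: The new absolute position of the pointer, which is why the code above prints 0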

The reference point is selected by the from_what argument. It accepts three values:

0: sets the reference point at the beginning of the file
1: sets the reference point at the current file position
2: sets the reference point at the end of the file

By default from_what argument is set to 0.
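
The three reference points can be tried out on a small file. This is a sketch under stated assumptions: the file name seek_demo.txt is made up for the example, and the file is opened in binary mode because text-mode files only accept offsets counted from the beginning of the file (apart from a zero offset).

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "seek_demo.txt")
with open(path, mode="wb") as fo:
    fo.write(b"1234567890")

with open(path, mode="rb") as fo:
    from_start = fo.seek(3, 0)      # 3 bytes after the beginning
    first = fo.read(2)              # reads b"45", pointer is now at 5
    from_current = fo.seek(2, 1)    # 2 bytes forward from the current position
    from_end = fo.seek(-3, 2)       # 3 bytes before the end of the file
    tail = fo.read()                # reads the last 3 bytes

print(from_start, first, from_current, from_end, tail)
```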

Readline method of read mode

The method readline() reads one entire line from the file. The trailing newline character is kept in the string.

fo=open("helloworld.txt",mode='r')
##str = fo.read(5)#This will read 5 bytes of the text file
str=fo.readline()#This reads only the first line of the text file, including its trailing newline
print(str)
pos =  fo.tell()#After reading the first line the pointer is at the start of the second line
print(pos)
pos=fo.seek(0,0)
print(pos)
fo.close()

>>>
== RESTART: E:\Software\Google Drive\PRWATECH\BATCH-A\LearningFileHandling.py ==
1234567890
#This is an empty line, printed because readline() keeps the trailing newline.
12
0
>>> 

Readlines method of read mode

The method readlines() reads all the lines of the file and returns them as a list.

fo=open("helloworld.txt",mode='r')
##str = fo.read(5)#This will read 5 bytes of the text file
str=fo.readlines()#This reads all the lines into a list
print(str)
pos =  fo.tell()#After reading every line the pointer is at the end of the file
print(pos)
pos=fo.seek(0,0)
print(pos)
fo.close()

Output:

>>>
== RESTART: E:\Software\Google Drive\SOMEFOLDER\BATCH-A\LearningFileHandling.py ==
['1234567890\n', '1234567890\n', '1234567890\n', '1234567890\n', '1234567890\n']
60
0
>>> 
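
For large files, a file object can also be looped over directly, which yields one line at a time instead of building the whole list in memory. A small sketch, assuming a temporary file named lines_demo.txt created just for the example:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "lines_demo.txt")
with open(path, mode="w") as fo:
    fo.write("1234567890\n" * 3)

line_lengths = []
with open(path, mode="r") as fo:
    for line in fo:                                   # one line per iteration
        line_lengths.append(len(line.rstrip("\n")))   # drop the trailing newline

print(line_lengths)
```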

Different types of modes to manipulate files.

SrNo  Mode  Description                   Truncate  Creates file if not present  File pointer
1     r     Read only                     N         N                            Start of the file
2     w     Write only                    Y         Y                            Start of the file
3     a     Append only (Write)           N         Y                            End of file
4     r+    Read and write                N         N                            Start of the file
5     w+    Write and read                Y         Y                            Start of the file
6     a+    Append and read (Write/Read)  N         Y                            End of file
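
The behaviour listed for each mode can be checked with a short script. This is only a sketch: the file names probe.txt and missing_*.txt are invented for the check, and everything happens inside a temporary directory.

```python
import os
import tempfile

workdir = tempfile.mkdtemp()
results = {}

for mode in ["r", "w", "a", "r+", "w+", "a+"]:
    # Start every probe from a file that holds 5 characters.
    path = os.path.join(workdir, "probe.txt")
    with open(path, mode="w") as fo:
        fo.write("12345")

    with open(path, mode=mode) as fo:
        start = fo.tell()                      # pointer position right after opening
    truncated = os.path.getsize(path) == 0     # did opening wipe the data?

    # Try the same mode on a file that does not exist yet.
    missing = os.path.join(workdir, "missing_" + mode.replace("+", "plus") + ".txt")
    try:
        open(missing, mode=mode).close()
        created = True
    except FileNotFoundError:
        created = False

    results[mode] = (truncated, created, start)
    print(mode, truncated, created, start)
```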



Test Case 1) Read Only.

fo=open("helloworld.txt",mode="r")
print ("Position before reading the file",fo.tell())
str=fo.read()
print ("Position after reading the file",fo.tell())
print (str)
fo.close()



Test Case 2) Write Only

In this case, when the file is not present, a new file is created; if the file already exists, its contents are overwritten with the new string.

fo=open("helloworld.txt",mode="w")
print (fo.tell())
fo.write('987654')
print (fo.tell())
fo.close()


Test Case 3) Append

In this case, when the file is not present, a new file is created; if the file already exists with some data, the new data is appended to the file.

fo=open("helloworld.txt",mode="a")
print ('The  position of the pointer before operation ',fo.tell())
fo.write('Appended Data')
print ('The  position of the pointer after operation ',fo.tell())
print (fo.tell())
fo.close()


Output :
>>>
== RESTART: E:\Software\Google Drive\PRWATECH\BATCH-A\LearningFileHandling.py ==
The  position of the pointer before operation  19
The  position of the pointer after operation  32
32

>>> 
Notice the initial position of the pointer. It is at the end of the file, i.e. 19.

Test Case 4) Read plus (r+)

This mode is used to both read and write. When the file is NOT present, it will not create a new file. The initial position of the pointer is at the start of the file.

##Reading the file

fo=open ("helloworld.txt",mode="r+")
print ('The  position of the pointer before operation ',fo.tell())
str=fo.read()
#fo.write('Appended Data')
print ('The  position of the pointer after operation ',fo.tell())
print (fo.tell())
print (str)
fo.close()

##Writing the file

Let's say the file helloworld.txt has the data 12345.

When the code below is executed, the file is updated to A2345, because r+ does not truncate the file and, since the pointer is at the beginning, writing starts overwriting the text from the beginning of the file.

fo=open ("helloworld.txt",mode="r+")
print ('The  position of the pointer before operation ',fo.tell())
##str=fo.read()
fo.write('A')
print ('The  position of the pointer after operation ',fo.tell())
print (fo.tell())
##print(str)
fo.close()
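
The overwrite can be confirmed by reading the file back through the same file object. A minimal sketch, assuming a temporary file named rplus_demo.txt that starts with the data 12345:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "rplus_demo.txt")
with open(path, mode="w") as fo:
    fo.write("12345")

with open(path, mode="r+") as fo:
    fo.write("A")          # overwrites only the first character
    fo.seek(0)             # move back to the start before reading
    content = fo.read()

print(content)
```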

Test Case 5) Write plus (w+)
This mode creates the file if it is not present; if it is present, the data is truncated before writing to the file. The pointer is at the start of the file.

#Writing the data

Let's say the file helloworld.txt has the data A2345. When the code below is executed, the file data is replaced with ABCDE.


fo=open ("helloworld.txt",mode="w+")
print('The  position of the pointer before operation ',fo.tell())
##str=fo.read()
fo.write('ABCDE')
print ('The  position of the pointer after operation ',fo.tell())
print (fo.tell())
##print(str)
fo.close()


Test Case 6) Append plus(a+)

This mode is used to append data to an existing file; if the file is not present, it is created. Data is not overwritten in the file. We can use this mode for reading the file as well. The file pointer starts at the end of the file.

fo=open ("helloworld.txt",mode="a+")
print('The  position of the pointer before operation ',fo.tell())
##str=fo.read()
fo.write('APPENDED TEXT')
print ('The  position of the pointer after operation ',fo.tell())
print (fo.tell())
##print(str)

fo.close()

Output:

The  position of the pointer before operation  5
The  position of the pointer after operation  18
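
Because the pointer starts at the end of the file in a+ mode, a read straight after opening returns nothing; the pointer has to be moved back first. A small sketch, assuming a temporary file named aplus_demo.txt that already holds ABCDE:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "aplus_demo.txt")
with open(path, mode="w") as fo:
    fo.write("ABCDE")

with open(path, mode="a+") as fo:
    at_open = fo.read()          # pointer is at the end, so this is empty
    fo.write("APPENDED TEXT")    # appended after the existing data
    fo.seek(0)                   # go back to the start to read everything
    whole = fo.read()

print(repr(at_open), whole)
```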



With Statement:

A file that has been opened always has to be closed.

fo=open ("helloworld.txt",mode="a+")
fo.close() ## We always have to ensure the file object is closed.

If we don't close the file object, buffered changes may not be written to the file.

The need for an explicit close can be avoided by using the with statement.


with open ("helloworld.txt",mode="a+") as fo:
    print('The  position of the pointer before operation ',fo.tell())
    ##str=fo.read()
    fo.write('APPENDED TEXT')
    print ('The  position of the pointer after operation ',fo.tell())
    print (fo.tell())



This does not require an explicit file closure. The with statement takes care of that, even if an exception is raised inside the block.
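
The automatic closure can be observed through the closed attribute of the file object. A sketch under stated assumptions: the file name with_demo.txt is hypothetical and the ValueError is raised on purpose to simulate a failure in the middle of the block.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "with_demo.txt")

try:
    with open(path, mode="w") as fo:
        fo.write("partial data")
        raise ValueError("simulated failure")   # leaves the block early
except ValueError:
    pass

closed_after_error = fo.closed      # the with statement closed the file anyway

with open(path, mode="r") as check:
    saved = check.read()            # the buffered write reached the file

print(closed_after_error, saved)
```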

Reading CSV File:

What is CSV file:

A Comma Separated Values (CSV) file is a plain text file that contains a list of data. These files are often used for exchanging data between different applications.

These files may sometimes be called Character Separated Values or Comma Delimited files. They mostly use the comma character to separate (or delimit) data, but sometimes use other characters, like semicolons. The idea is that you can export complex data from one application to a CSV file, and then import the data in that CSV file into another application.

How to Read CSV file in Python

# importing csv module
import csv
# csv file name
filename = "FL_insurance_sample.csv"
# initializing the rows list
rows = []
# reading csv file (newline='' lets the csv module handle line endings itself)
with open(filename, 'r', newline='') as csvfile:
    # creating a csv reader object
    csvreader = csv.reader(csvfile,delimiter=',')
    print(csvreader)
    # extracting each data row one by one
    for row in csvreader:
        print(row)
        rows.append(row)
    # line_num is the number of lines read from the file so far
    print("Total no. of rows: %d"%(csvreader.line_num))
for row in rows:
    print(row)


How to Write CSV

1) Row by Row

A CSV file can be written row by row with the writerow method of the writer object, which also makes it suitable for use inside a loop. The default delimiter is a comma.

import csv
with open('employee1.csv','w',newline='') as file:
    writer = csv.writer(file)
    writer.writerow(["empid","Name","salary"])
    writer.writerow([1,"Sandro","1000"])
    writer.writerow([2,"Warren","2000"])
    writer.writerow([3,"Maria","3000"])

2) All Rows at Once


import csv
rowlist=[
            ["empid","Name","salary"],
            [1,"Sandro","1000"],
            [2,"Warren","2000"],
            [3,"Maria","3000"]
        ]
with open('employee1.csv','w',newline='') as file:
    writer = csv.writer(file)
    writer.writerows(rowlist)


CSV File with Custom Delimiters.


import csv
rowlist=[
            ["empid","Name","salary"],
            [1,"Sandro","1000"],
            [2,"Warren","2000"],
            [3,"Maria","3000"]
        ]
with open('employee1.csv','w',newline='') as file:
    writer = csv.writer(file,delimiter=";")
    writer.writerows(rowlist)


CSV Files with Quotes

import csv
rowlist=[
            ["empid","Name","salary"],
            [1,"Sandro","1000"],
            [2,"Warren","2000"],
            [3,"Maria","3000"]
        ]
with open('employee1.csv','w',newline='') as file:
    writer = csv.writer(file,delimiter=";",
                              quoting=csv.QUOTE_NONNUMERIC)
    writer.writerows(rowlist)


As you can see, we have passed csv.QUOTE_NONNUMERIC to the quoting parameter.
It is a constant defined by the csv module.
csv.QUOTE_NONNUMERIC tells the writer object to add quotes around all non-numeric entries.
Other predefined constants you can pass to the quoting parameter include:
csv.QUOTE_ALL - Tells the writer object to write the CSV file with quotes around all the entries.
csv.QUOTE_MINIMAL - Tells the writer object to quote only those fields which contain special characters (delimiter, quotechar or any characters in lineterminator). It is the default value.
csv.QUOTE_NONE - Tells the writer object that none of the entries should be quoted.
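
The effect of the quoting constants is easiest to see side by side. A minimal sketch, assuming an in-memory buffer (io.StringIO) instead of a file and an invented sample row that contains the delimiter in one field:

```python
import csv
import io

row = [1, "Sandro", "a;b", "1000"]   # "a;b" contains the delimiter
outputs = {}

for name in ["QUOTE_MINIMAL", "QUOTE_ALL", "QUOTE_NONNUMERIC"]:
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=";", quoting=getattr(csv, name))
    writer.writerow(row)
    outputs[name] = buffer.getvalue()   # one written line, ending in \r\n
    print(name, repr(outputs[name]))
```

csv.QUOTE_NONE is left out of the loop because, without an escapechar, the writer raises an error for a field that contains the delimiter.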

CSV files with custom quoting character


import csv
rowlist=[
            ["empid","Name","salary"],
            [1,"Sandro","1000"],
            [2,"Warren","2000"],
            [3,"Maria","3000"]
        ]
with open('employee1.csv','w',newline='') as file:
    writer = csv.writer(file,
                        delimiter=";",
                        quoting=csv.QUOTE_NONNUMERIC,
                        quotechar="*"
                        )
    writer.writerows(rowlist)

Dialects in CSV module

import csv
rowlist=[
            ["empid","Name","salary"],
            [1,"Sandro","1000"],
            [2,"Warren","2000"],
            [3,"Maria","3000"]
        ]
csv.register_dialect('myDialect',
                     delimiter='|',
                     quoting=csv.QUOTE_ALL,
                     quotechar="*")
with open('employee1.csv','w',newline='') as file:
    writer = csv.writer(file,dialect="myDialect")
    writer.writerows(rowlist)
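
A registered dialect bundles the formatting options under one name, so the same name can be reused when reading the data back. A sketch, assuming a hypothetical dialect name pipeStar and an in-memory buffer in place of a file:

```python
import csv
import io

csv.register_dialect("pipeStar",
                     delimiter="|",
                     quoting=csv.QUOTE_ALL,
                     quotechar="*")

buffer = io.StringIO()
writer = csv.writer(buffer, dialect="pipeStar")
writer.writerows([["empid", "Name"], [1, "Sandro"]])
text = buffer.getvalue()

buffer.seek(0)                                        # rewind before reading
rows_back = list(csv.reader(buffer, dialect="pipeStar"))

print(repr(text))
print(rows_back)
```

Note that the number 1 comes back as the string '1', since the reader returns every field as text.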

CSV With DictWriter

Writing Dictionary into CSV file

import csv 
#Create a list of dictionaries, each dictionary is written as one row
mydict =[
          {'id':'123','name':'Sandro','salary':'10000','department':'Mechanical'},
          {'id':'124','name':'Monica','salary':'20000','department':'Electrical'},
          {'id':'125','name':'Adler','salary':'30000','department':'IT'},
          {'id':'126','name':'Theodor','salary':'40000','department':'Support'}
        ]
# Set field names (field names are the column names)
fields = ['id','name','salary','department']
# Set name of csv file
filename = "employee.csv"
# writing to csv file (newline='' avoids blank lines between rows on Windows)
with open(filename, 'w', newline='') as csvfile:
    # creating a csv dict writer object 
    writer = csv.DictWriter(csvfile, fieldnames = fields) 
    # writing headers (field names) 
    writer.writeheader() 
    # writing data rows 
    writer.writerows(mydict) 


Reading CSV file into Dictionary


import csv 
filename ="employee.csv"
with open(filename, 'r', newline='') as data:
    for line in csv.DictReader(data):
        print(line)



Output:

{'id': '123', 'name': 'Sandro', 'salary': '10000', 'department': 'Mechanical'}
{'id': '124', 'name': 'Monica', 'salary': '20000', 'department': 'Electrical'}
{'id': '125', 'name': 'Adler', 'salary': '30000', 'department': 'IT'}
{'id': '126', 'name': 'Theodor', 'salary': '40000', 'department': 'Support'}
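
Every value in these dictionaries is a string, including the salary, so numeric columns need an explicit conversion before any arithmetic. A minimal sketch, assuming a two-row sample held in memory rather than the employee.csv file:

```python
import csv
import io

csv_text = (
    "id,name,salary,department\n"
    "123,Sandro,10000,Mechanical\n"
    "124,Monica,20000,Electrical\n"
)

employees = []
for record in csv.DictReader(io.StringIO(csv_text)):
    record["salary"] = int(record["salary"])   # text to number
    employees.append(record)

total_salary = sum(employee["salary"] for employee in employees)
print(total_salary)
```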



JSON Parsing.

What is JSON?
JSON or JavaScript Object Notation is a format for structuring data.

What is it used for?
Like XML, it is one of the ways of formatting data. This format is used by web applications to communicate with each other.

Properties of JSON
·         It is human-readable and writable.
·         It is a lightweight, text-based data interchange format, which makes it simpler to read and write than XML.
·         Though it is derived from a subset of JavaScript, it is language independent. Thus, the code for generating and parsing JSON data can be written in any other programming language.

A simple JSON data example:

JSON String: '{"id": "123", "name": "Sandro", "salary": "10000", "department": "Mechanical"}'

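Note: A JSON object looks very similar to a dictionary in Python, but JSON is text: its keys must be double-quoted strings, and it writes true, false and null where Python writes True, False and None.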
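
The places where JSON and a Python dictionary differ show up in a round trip through the json module. A short sketch with an invented sample value:

```python
import json

python_value = {
    "id": 123,
    "active": True,                 # written as true in JSON
    "manager": None,                # written as null in JSON
    "skills": ("cad", "welding"),   # a tuple is written as a JSON array
}

json_text = json.dumps(python_value)
round_trip = json.loads(json_text)

print(json_text)
print(round_trip)
```

After the round trip the tuple has become a list, because JSON has only one array type.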

JSON data into python.

import json 
  # JSON Data
jsondata='{"id": "123", "name": "Sandro", "salary": "10000", "department": "Mechanical"}'
  
# Convert json data to Python dict 
python_dict = json.loads(jsondata) 
print(python_dict)


Output:

{'id': '123', 'name': 'Sandro', 'salary': '10000', 'department': 'Mechanical'}



Python Data into JSON Object 

import json 
pydict={"id":"123","name":"Sandro","salary":"10000","department":"Mechanical"}
json_object = json.dumps(pydict, indent = 4)
print(json_object)

Output:
{
    "id": "123",
    "name": "Sandro",
    "salary": "10000",
    "department": "Mechanical"
}

Python Data into JSON File
import json 
pydict={"id":"123","name":"Sandro","salary":"10000","department":"Mechanical"}
with open("employee.json","w") as f:
    json.dump(pydict, f)

JSON File into Python Data
import json 
with open('employee.json') as f: # Open JSON File
  pydict = json.load(f) 
print(pydict)

Output:
{'id': '123', 'name': 'Sandro', 'salary': '10000', 'department': 'Mechanical'}

Zipping the data files in Python


What is Zip File

ZIP is an archive file format that supports lossless data compression. A ZIP file may contain one or more files or directories that may have been compressed [R].

How to Zip File

A very simple example that zips a single file.

from zipfile import ZipFile

with ZipFile("sample2.zip",mode='w',allowZip64=True) as z:
  z.write("E:/Software/Google Drive/PRWATECH/BATCH-A/sample/test.txt","test.txt")


A more sophisticated example, though not the best, that zips every file found under a directory.


import os
from zipfile import ZipFile

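directory=r"E:\Software\Google Drive\PRWATECH\BATCH-A\sample"  # raw string keeps the backslashes literal

with ZipFile("sample2.zip",mode='w',allowZip64=True) as z:
  for root, directories, files in os.walk(directory): 
      for filename in files: 
        filepath = os.path.join(root, filename)
        # only the file name is stored, so the sub-directory structure is not kept in the archive
        z.write(filepath,filename)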
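
Since the directory example stores each file under its bare name, two files with the same name in different sub-directories end up as duplicate entries. One possible variation, sketched here with an invented directory layout inside a temporary folder, stores the path relative to the zipped directory instead:

```python
import os
import tempfile
from zipfile import ZipFile

# Build a small sample tree: sample/test.txt and sample/sub/test.txt
base = tempfile.mkdtemp()
source_dir = os.path.join(base, "sample")
os.makedirs(os.path.join(source_dir, "sub"))
for rel in ["test.txt", os.path.join("sub", "test.txt")]:
    with open(os.path.join(source_dir, rel), mode="w") as f:
        f.write(rel)

archive = os.path.join(base, "sample2.zip")
with ZipFile(archive, mode="w") as z:
    for root, directories, files in os.walk(source_dir):
        for filename in files:
            filepath = os.path.join(root, filename)
            # store the path relative to source_dir to keep the folder structure
            z.write(filepath, os.path.relpath(filepath, source_dir))

with ZipFile(archive, mode="r") as z:
    names = sorted(z.namelist())

print(names)
```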


Unzipping the File

from zipfile import ZipFile 
file_name = "sample2.zip"
with ZipFile(file_name, 'r') as zf:
    zf.printdir() 
    zf.extractall() 
    print('Done!')

      

What is Pickling

The pickle module implements binary protocols for serializing and de-serializing a Python object structure. “Pickling” is the process whereby a Python object hierarchy is converted into a byte stream, and “unpickling” is the inverse operation, whereby a byte stream (from a binary file or bytes-like object) is converted back into an object hierarchy[R].

Warning:

1)       The pickle module is not secure.
2)       Only unpickle data from a trusted source.
3)       Unpickling data can execute malicious code embedded in it.

How to Pickle a Python Object

import pickle 
lst=[1,2,3,4,5,6]
# 'ab' appends, so each run adds another pickled object to the file
with open('mypickledfile','ab') as f:
  pickle.dump(lst, f)    



How to unpickle a binary file.

import pickle
with open('mypickledfile','rb') as f:
  pyobject = pickle.load(f)  # reads one pickled object, the first one in the file
  print(type(pyobject))
  print(pyobject)

The data format used by pickle is Python-specific. It means that non-Python programs may not be able to reconstruct pickled Python objects.
By default, the pickle data format uses a relatively compact binary representation.
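
Pickling does not need a file at all: pickle.dumps and pickle.loads do the same conversion in memory, which makes the byte stream easy to inspect. A small sketch with an invented sample object:

```python
import pickle

original = {"numbers": [1, 2, 3], "point": (1.5, 2.5), "tags": {"a", "b"}}

payload = pickle.dumps(original)     # Python object -> bytes
restored = pickle.loads(payload)     # bytes -> new Python object

print(type(payload))
print(restored == original, restored is original)
```

The restored object is equal to the original but is a separate copy, and types such as tuples and sets survive the round trip.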



Logging and Debugging


What is logging:

Suppose an application encounters an exception. We may have handled the exception in the program to avoid unwanted termination, but we cannot identify what happened unless it has been written somewhere, such as a file. Recording the events and exceptions of a program in this way is called logging. With logging, you can leave a trail of breadcrumbs so that if something goes wrong, we can determine the cause of the problem.
Python has a built-in module, logging, which allows writing status messages to a file or any other output stream. The file can contain information on which part of the code was executed and what problems have arisen.

Below are the built-in levels of a log message.
·         Debug : Gives detailed information, typically of interest only when diagnosing problems.
·         Info : Confirms that things are working as expected.
·         Warning : Indicates that something unexpected happened, or that some problem may occur in the near future.
·         Error : Tells that, due to a more serious problem, the software has not been able to perform some function.
·         Critical : Tells of a serious error, indicating that the program itself may be unable to continue running.
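
The levels are ordered, and a logger only records messages at or above the level it is set to. A minimal sketch, assuming a hypothetical logger name levels_demo and an in-memory stream in place of a log file:

```python
import io
import logging

stream = io.StringIO()

logger = logging.getLogger("levels_demo")
logger.setLevel(logging.WARNING)      # ignore DEBUG and INFO messages
logger.propagate = False              # keep the messages out of the root logger

handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s:%(message)s"))
logger.addHandler(handler)

logger.debug("loop variable is 3")
logger.info("step finished")
logger.warning("denominator is close to zero")
logger.error("division failed")
logger.critical("cannot continue")

recorded = stream.getvalue().splitlines()
print(recorded)
```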

How to log the information

import logging 
logging.basicConfig(filename='example.log',level=logging.DEBUG,filemode="w")

logging.debug('The code started from here')
try:
  a=9
  deno=[9,8,7,6,5,4,3,2,1,0]
  
  for b in deno:
    logging.info("===================================================")
    logging.info("The value of divisor b is ::"+str(b))
    if b<=4 and b>=1:
        logging.warning("The denominator is close to zero")
    elif b==0:
        logging.critical("The denominator is zero, division is not possible")
    logging.debug('Before Division')
    c=a/b
    logging.info("The value of  a/b is ::"+str(c))
    logging.debug('After Division')

    logging.info('The division is successful')
except Exception as e:
  logging.error(e)
  logging.exception(e)
  logging.critical(e)
logging.debug('The code ends  here')