File Operations

load_json(json_file, **kwargs)
Open and load data from a JSON file.
reusables.load_json("example.json")
# {u'key_1': u'val_1', u'key_for_dict': {u'sub_dict_key': 8}}
Parameters: - json_file – Path to JSON file as string
- kwargs – Additional arguments for the json.load command
Returns: Dictionary

list_to_csv(my_list, csv_file)
Save a matrix (list of lists) to a file as a CSV.
my_list = [["Name", "Location"],
           ["Chris", "South Pole"],
           ["Harry", "Depth of Winter"],
           ["Bob", "Skull"]]
reusables.list_to_csv(my_list, "example.csv")
example.csv
Parameters: - my_list – list of lists to save to CSV
- csv_file – File to save data to

save_json(data, json_file, indent=4, **kwargs)
Takes a dictionary and saves it to a file as JSON.
my_dict = {"key_1": "val_1", "key_for_dict": {"sub_dict_key": 8}}
reusables.save_json(my_dict, "example.json")
example.json
{
    "key_1": "val_1",
    "key_for_dict": {
        "sub_dict_key": 8
    }
}
Parameters: - data – dictionary to save as JSON
- json_file – Path to save file location as str
- indent – number of spaces to indent the JSON file with
- kwargs – Additional arguments for the json.dump command

csv_to_list(csv_file)
Open and transform a CSV file into a matrix (list of lists).
reusables.csv_to_list("example.csv")
# [['Name', 'Location'],
#  ['Chris', 'South Pole'],
#  ['Harry', 'Depth of Winter'],
#  ['Bob', 'Skull']]
Parameters: csv_file – Path to CSV file as str
Returns: list

extract(archive_file, path='.', delete_on_success=False, enable_rar=False)
Automatically detect archive type and extract all files to the specified path.
import os
os.listdir(".")
# ['test_structure.zip']
reusables.extract("test_structure.zip")
os.listdir(".")
# ['test_structure', 'test_structure.zip']
Parameters: - archive_file – path to file to extract
- path – location to extract to
- delete_on_success – Will delete the original archive if set to True
- enable_rar – include the rarfile import and extract
Returns: path to extracted files

archive(files_to_archive, name='archive.zip', archive_type=None, overwrite=False, store=False, depth=None, err_non_exist=True, allow_zip_64=True, **tarfile_kwargs)
Archive a list of files (or files inside a folder); the archive type can be chosen from:
- zip
- tar
- gz (tar.gz, tgz)
- bz2 (tar.bz2)
reusables.archive(['reusables', '.travis.yml'], name="my_archive.bz2")
# 'C:\Users\Me\Reusables\my_archive.bz2'
Parameters: - files_to_archive – list of files and folders to archive
- name – path and name of archive file
- archive_type – auto-detects unless specified
- overwrite – overwrite if archive exists
- store – zipfile only, True will not compress files
- depth – specify max depth for folders
- err_non_exist – raise error if provided file does not exist
- allow_zip_64 – must be enabled for zip files larger than 2GB
- tarfile_kwargs – extra args to pass to tarfile.open
Returns: path to created archive

config_dict(config_file=None, auto_find=False, verify=True, **cfg_options)
Return configuration options as a dictionary. Accepts either a single config file or a list of files. Auto find will search for all .cfg, .config and .ini files in the execution directory and package root (unsafe but handy).
reusables.config_dict(os.path.join("test", "data", "test_config.ini"))
# {'General': {'example': 'A regular string'},
#  'Section 2': {'anint': '234',
#                'examplelist': '234,123,234,543',
#                'floatly': '4.4',
#                'my_bool': 'yes'}}
Parameters: - config_file – path or paths to the files location
- auto_find – look for a config type file at this location or below
- verify – make sure the file exists before trying to read
- cfg_options – options to pass to the parser
Returns: dictionary of the config files

config_namespace(config_file=None, auto_find=False, verify=True, **cfg_options)
Return configuration options as a Namespace.
reusables.config_namespace(os.path.join("test", "data", "test_config.ini"))
# <Namespace: {'General': {'example': 'A regul...>
Parameters: - config_file – path or paths to the files location
- auto_find – look for a config type file at this location or below
- verify – make sure the file exists before trying to read
- cfg_options – options to pass to the parser
Returns: Namespace of the config files

os_tree(directory, enable_scandir=False)
Return a directory's contents as a dictionary hierarchy.
reusables.os_tree(".")
# {'doc': {'build': {'doctrees': {},
#                    'html': {'_sources': {}, '_static': {}}},
#          'source': {}},
#  'reusables': {'__pycache__': {}},
#  'test': {'__pycache__': {}, 'data': {}}}
Parameters: - directory – path to directory to created the tree of.
- enable_scandir – on python < 3.5 enable external scandir package
Returns: dictionary of the directory

check_filename(filename)
Returns a boolean stating whether the filename is safe to use. Note that this does not test for all "legal" names, but a more restricted set of: letters, numbers, spaces, hyphens, underscores and periods.
Parameters: filename – name of a file as a string
Returns: boolean, True if it is a safe file name

count_files(*args, **kwargs)
Returns an integer count of all files found using find_files.

directory_duplicates(directory, hash_type='md5', **kwargs)
Find all duplicates in a directory. Returns a list of lists, where each inner list is a group of duplicate files.
Parameters: - directory – Directory to search
- hash_type – Type of hash to perform
- kwargs – Arguments to pass to find_files to narrow file types
Returns: list of lists of duplicate files

dup_finder(file_path, directory='.', enable_scandir=False)
Check a directory for duplicates of the specified file. This is meant for a single file only; to check a directory for duplicates, use directory_duplicates.
This is designed to be as fast as possible by doing lighter checks before progressing to more extensive ones; in order, they are:
- File size
- First twenty bytes
- Full SHA256 compare
list(reusables.dup_finder("test_structure\files_2\empty_file"))
# ['C:\Reusables\test\data\fake_dir',
#  'C:\Reusables\test\data\test_structure\Files\empty_file_1',
#  'C:\Reusables\test\data\test_structure\Files\empty_file_2',
#  'C:\Reusables\test\data\test_structure\files_2\empty_file']
Parameters: - file_path – Path to file to check for duplicates of
- directory – Directory to dig recursively into to look for duplicates
- enable_scandir – on python < 3.5 enable external scandir package
Returns: generator of matching file paths

file_hash(path, hash_type='md5', block_size=65536, hex_digest=True)
Hash a given file with md5 (or any other algorithm) and return the hex digest. You can run hashlib.algorithms_available to see which are available on your system (unless you have an archaic Python version, you poor soul).
This function is designed to be memory efficient: the file is read in blocks of block_size bytes rather than all at once.
reusables.file_hash("test_structure.zip")
# '61e387de305201a2c915a4f4277d6663'
Parameters: - path – location of the file to hash
- hash_type – string name of the hash to use
- block_size – amount of bytes to add to hasher at a time
- hex_digest – if True, return the hex digest; if False, return the raw digest
Returns: file’s hash

find_files(directory='.', ext=None, name=None, match_case=False, disable_glob=False, depth=None, abspath=False, enable_scandir=False)
Walk through a file directory and return an iterator of files that match the requirements. Will autodetect if name contains glob magic characters.
Note: in the example below, list() is used simply to show the output; you can use find_files_list to get a list directly.
list(reusables.find_files(name="ex", match_case=True))
# ['C:\example.pdf',
#  'C:\My_exam_score.txt']
list(reusables.find_files(name="*free*"))
# ['C:\my_stuff\Freedom_fight.pdf']
list(reusables.find_files(ext=".pdf"))
# ['C:\Example.pdf',
#  'C:\how_to_program.pdf',
#  'C:\Hunks_and_Chicks.pdf']
list(reusables.find_files(name="*chris*"))
# ['C:\Christmas_card.docx',
#  'C:\chris_stuff.zip']
Parameters: - directory – Top location to recursively search for matching files
- ext – Extensions of the file you are looking for
- name – Part of the file name
- match_case – If name or ext has to be a direct match or not
- disable_glob – Do not look for globable names or use glob magic check
- depth – How many directories down to search
- abspath – Return files with their absolute paths
- enable_scandir – on python < 3.5 enable external scandir package
Returns: generator of all files in the specified directory

find_files_list(*args, **kwargs)
Returns the output of the find_files generator as a list.

join_here(*paths, **kwargs)
Join any path or paths as a sub directory of the current file's directory.
reusables.join_here("Makefile")
# 'C:\Reusables\Makefile'
Parameters: - paths – paths to join together
- kwargs – ‘strict’, do not strip os.sep
- kwargs – ‘safe’, make into a safe path if True
Returns: abspath as string

join_paths(*paths, **kwargs)
Join multiple paths together and return the absolute path of them. If ‘safe’ is specified, this function will ‘clean’ the path with the ‘safe_path’ function. This will clean root declarations from the path after the first item.
Note: ‘safe=False’ would be preferable to ‘**kwargs’, but older versions of Python (cough, 2.6) do not allow keyword arguments after ‘*paths’.
Parameters: - paths – paths to join together
- kwargs – ‘safe’, make into a safe path if True
Returns: abspath as string

remove_empty_directories(root_directory, dry_run=False, ignore_errors=True, enable_scandir=False)
Remove all empty folders from a path. Returns list of empty directories.
Parameters: - root_directory – base directory to start at
- dry_run – just return a list of what would be removed
- ignore_errors – permissions are a pain; ignore errors if you are blocked
- enable_scandir – on python < 3.5 enable external scandir package
Returns: list of removed directories

remove_empty_files(root_directory, dry_run=False, ignore_errors=True, enable_scandir=False)
Remove all empty files from a path. Returns list of the empty files removed.
Parameters: - root_directory – base directory to start at
- dry_run – just return a list of what would be removed
- ignore_errors – permissions are a pain; ignore errors if you are blocked
- enable_scandir – on python < 3.5 enable external scandir package
Returns: list of removed files

safe_filename(filename, replacement='_')
Replace unsafe filename characters with underscores. Note that this does not test for all "legal" names, but a more restricted set of: letters, numbers, spaces, hyphens, underscores and periods.
Parameters: - filename – name of a file as a string
- replacement – character to use as a replacement of bad characters
Returns: safe filename string

safe_path(path, replacement='_')
Replace unsafe path characters with underscores. Do NOT use this with existing paths that cannot be modified; this is to help generate new, clean paths.
Supports windows and *nix systems.
Parameters: - path – path as a string
- replacement – character to use in place of bad characters
Returns: a safer path

touch(path)
Native ‘touch’ functionality in Python: creates the file if it does not already exist.
Parameters: path – path to file to ‘touch’