Parquet2CSV
- class geoanalytics.conversion.Parquet2CSV.Parquet2CSV(inputFile, outputFile, sep)[source]
Bases:
object
About this algorithm
- Description:
This class converts a Parquet file into a CSV file.
- Reference:
- Parameters:
inputFile (str) – Path to the input Parquet file.
outputFile (str) – Path to the output CSV file.
sep (str) – Separator used to distinguish items from one another. The default separator is a tab; users can override it with a separator of their own.
- Attributes:
getMemoryUSS() (int) – Returns the memory used by the process in USS (Unique Set Size).
getMemoryRSS() (int) – Returns the memory used by the process in RSS (Resident Set Size).
getRuntime() (float) – Returns the time taken to execute the conversion.
printStats() – Prints statistics about memory usage and runtime.
- Methods:
convert() – Reads the Parquet file, converts it to a CSV file, and tracks memory usage and runtime.
Execution methods
Terminal command
Format:
    (.venv) $ python3 _CSV2Parquet.py <inputFile> <outputFile> <sep>
Example Usage:
    (.venv) $ python3 _CSV2Parquet.py output.parquet sampleDB.csv
Calling from a Python program
    import geoanalytics.conversion.Parquet2CSV as pc

    inputFile = 'output.parquet'
    sep = " "
    outputFile = 'sampleDB.csv'

    obj = pc.Parquet2CSV(inputFile, outputFile, sep)
    obj.convert()
    obj.printStats()
Credits
The complete program was written by P. Likhitha and revised by Tarun Sreepada under the supervision of Professor Rage Uday Kiran.
- convert()[source]
Converts the input Parquet file into a CSV file. The values in each row are joined by the specified separator, and the result is written to the output file.
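The row-joining step described above can be sketched with the standard library alone. This is a minimal illustration, not the class's actual implementation: `write_rows` and the sample `rows` are hypothetical names, and the rows stand in for data that would already have been read from a Parquet file by a separate reader.

```python
import os
import tempfile


def write_rows(rows, output_file, sep="\t"):
    """Write each row as one line, with its values joined by `sep`."""
    with open(output_file, "w", encoding="utf-8") as out:
        for row in rows:
            # Values are converted to text before joining; no quoting is applied.
            out.write(sep.join(str(value) for value in row) + "\n")


# Hypothetical rows standing in for data already read from a Parquet file.
rows = [["a", "b", "c"], ["d", "e"], [1, 2, 3]]

output_path = os.path.join(tempfile.mkdtemp(), "sampleDB.csv")
write_rows(rows, output_path, sep="\t")

with open(output_path, encoding="utf-8") as f:
    written = f.read()
```

A plain join does not quote values that themselves contain the separator, so a separator that never occurs in the data is the safer choice.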
- getMemoryRSS()[source]
Returns the memory used by the process in RSS (Resident Set Size).
- Returns:
The total memory (in bytes) used by the process in RAM.
- Return type:
int
- getMemoryUSS()[source]
Returns the memory used by the process in USS (Unique Set Size).
- Returns:
The amount of memory (in bytes) used exclusively by the process.
- Return type:
int
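To make the difference between RSS and USS concrete, the sketch below computes both from Linux smaps-style counters. It only illustrates the two definitions and is not how this class obtains its numbers, which the documentation does not state; `parse_smaps` and the sample figures are hypothetical.

```python
def parse_smaps(text):
    """Return (rss_bytes, uss_bytes) from smaps-style 'Key:  N kB' lines."""
    totals = {"Rss": 0, "Private_Clean": 0, "Private_Dirty": 0}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        if key in totals:
            totals[key] += int(rest.split()[0]) * 1024  # kB -> bytes
    # RSS counts every resident page, shared or private.
    rss = totals["Rss"]
    # USS counts only the pages that belong to this process alone.
    uss = totals["Private_Clean"] + totals["Private_Dirty"]
    return rss, uss


# Made-up counters in the layout the Linux kernel uses for smaps files.
sample = """\
Rss:               20480 kB
Pss:               15000 kB
Shared_Clean:       6144 kB
Shared_Dirty:          0 kB
Private_Clean:      4096 kB
Private_Dirty:     10240 kB
"""

rss_bytes, uss_bytes = parse_smaps(sample)
```

On Linux, the same function could be fed the contents of `/proc/self/smaps_rollup`. In the sample, RSS exceeds USS by exactly the shared pages, which is why USS is the better estimate of what would be freed if the process exited.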