Understanding PDF Documents as Data Pipelines [2021]
By: Jane Doe, John Smith
This paper proposes a novel framework for conceptualizing PDF documents as data pipelines to address challenges in data extraction and document analysis.
Spreadsheet paradigm transition.
"people and businesses fail to distinguish between data processing and data analysis and visualization"
"It's too easy to lose data. It's easy for data to be altered."
An Introduction to Capsicum [2015]
By: Justin Cormack
This paper provides a detailed overview of Capsicum, a framework that improves application and system security by compartmentalizing applications.
https://www.netbsd.org/gallery/presentations/justin/2015_AsiaBSDCon/justincormack-abc2015.pdf
I might have come up with a less efficient counting sort alternative :)
The idea is to search for the highest number, and then to create a 2-D array whose length equals the highest number plus one. Afterwards, every item of the unsorted array is placed into the sub-array whose index equals its value. So every 0 goes into the first sub-array, every 2 goes into the third sub-array, and every occurrence of the highest number goes into the last sub-array. In the end, an array that may look like this: [[0,0],[1],[2],[],[4,4,4]] is compiled into a proper 1-D array (in this case: [0,0,1,2,4,4,4]).

Here's the Python code:

    import time
    import numpy as np

    def proto_sort(arr):
        # Find the largest value; it determines how many buckets are needed.
        highest_value = 0
        for item in arr:
            if item > highest_value:
                highest_value = item

        # One independent bucket per possible value, 0..highest_value.
        sorted_arr = [[] for _ in range(highest_value + 1)]
        for num in arr:
            sorted_arr[num].append(num)

        # Flatten the buckets, in index order, into a 1-D array.
        output_arr = [None] * len(arr)
        num = 0
        for item in sorted_arr:
            for j in item:
                output_arr[num] = j
                num += 1
        return output_arr

    # Example usage
    arr = []
    for i in range(100):
        arr.append(np.random.randint(0, 100))

    # Time only the sort itself.
    start = time.time()
    sorted_arr = proto_sort(arr)
    end = time.time()
    print(end - start)

(Edits for clarification)
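For comparison, here is a minimal sketch of the conventional counting sort that the approach above is a variation of. It is not part of my code; the name `counting_sort` is just for illustration, and it assumes the input contains only non-negative integers. Instead of keeping a sub-array per value, it keeps a single tally per value and rebuilds the output from the tallies.

```python
def counting_sort(arr):
    """Sort non-negative integers by tallying how often each value occurs."""
    if not arr:
        return []

    # counts[v] holds the number of times v appears in arr.
    counts = [0] * (max(arr) + 1)
    for value in arr:
        counts[value] += 1

    # Emit each value as many times as it was counted, in ascending order.
    output = []
    for value, count in enumerate(counts):
        output.extend([value] * count)
    return output


print(counting_sort([4, 0, 2, 4, 1, 0, 4]))  # [0, 0, 1, 2, 4, 4, 4]
```

Storing one integer per value rather than a list of copies is what saves memory here; the number of passes over the data is the same as in the bucket version.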