📜 Understanding PDF Documents as Data Pipelines [2021]

By: Jane Doe, John Smith

This paper proposes a novel framework for conceptualizing PDF documents as data pipelines to address challenges in data extraction and document analysis.

📖 https://lobste.rs/t/pdf

#paperswelove #research #compsci

Spreadsheet paradigm transition.

"people and businesses fail to distinguish between data processing and data analysis and visualization"

"It's too easy to lose data. It's easy for data to be altered."

#SkillsMismatch #ComputerScience #CompSci

https://www.bbc.com/news/articles/cwyxkzjpp87o

Excel: Workers cling to the software despite shift to AI

Companies are trying to wean staff off Excel spreadsheets to centralise control of their data.

#MULCIA: PhD position in AI4Math at Chalmers. https://1pt.co/t8fd7 #PhD #AI4Math #CompSci

Alright, future engineers!
**Modular Arithmetic (a mod n):** Finds the remainder when 'a' is divided by 'n'. Numbers wrap around.
Ex: 7 mod 3 = 1 (because 7 = 2*3 + 1).
Pro-Tip: Crucial for time math (clocks!), cryptography, and hashing algorithms.
#DiscreteMath #CompSci #STEM #StudyNotes
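
A minimal sketch of the wrap-around idea in Python, using the built-in `%` operator. The helper names `clock_add` and `bucket_index` are made up for illustration; they only show how the clock and hashing uses mentioned above reduce to a single remainder operation.

```python
def clock_add(hour, delta, modulus=12):
    """Add delta hours on a clock face; the result wraps into 0..modulus-1."""
    return (hour + delta) % modulus


def bucket_index(key, num_buckets):
    """Map an integer key to a bucket index, as a simple hash table might."""
    return key % num_buckets


print(7 % 3)                # 1, because 7 = 2*3 + 1
print(clock_add(10, 5))     # 3, five hours after 10 o'clock
print(bucket_index(42, 8))  # 2, because 42 = 5*8 + 2
print(-7 % 3)               # 2, Python's result takes the sign of the divisor
```

Note that the last line is language-specific: C and Java, for example, give -1 for the same expression, so negative operands are worth checking when porting code.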

📜 An Introduction to Capsicum [2015]

By: Justin Cormack

The paper "An Introduction to Capsicum" by Justin Cormack provides a detailed overview of Capsicum, a framework that improves application and system security by compartmentalizing applications.

📖 https://www.netbsd.org/gallery/presentations/justin/2015_AsiaBSDCon/justincormack-abc2015.pdf

#paperswelove #research #compsci

Learning Regular Languages with the TTT Algorithm

1 comment

Lobsters

I might have come up with a less efficient counting sort alternative :)

https://feddit.org/post/31025476

The idea is to find the highest number and then create a 2D array whose length equals the highest number plus one. Each item of the unsorted array is then placed into the sub-array whose index equals its value. So every 0 goes into the first sub-array, every 2 into the third, and every occurrence of the highest number into the last. The result, which may look like [[0,0],[1],[2],[],[4,4,4]], is then flattened into a proper 1D array (in this case: [0,0,1,2,4,4,4]).

Here's the Python code:

    import time
    import numpy as np

    def proto_sort(arr):
        # Find the largest value; it determines how many buckets are needed.
        highest_value = 0
        for item in arr:
            if item > highest_value:
                highest_value = item
        # One independent bucket per possible value, 0..highest_value.
        sorted_arr = [[] for _ in range(highest_value + 1)]
        for num in arr:
            sorted_arr[num].append(num)
        # Flatten the buckets, in index order, into a 1D list.
        output_arr = []
        for bucket in sorted_arr:
            output_arr.extend(bucket)
        return output_arr

    # Example usage
    arr = []
    for i in range(100):
        arr.append(np.random.randint(0, 100))
    start = time.time()  # time only the sort itself
    sorted_arr = proto_sort(arr)
    end = time.time()
    print(end - start)

(Edits for clarification)
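
For comparison with the bucket-per-value approach described in the post, here is a minimal sketch of conventional counting sort, assuming non-negative integer input. It stores one counter per value instead of one list per value. The function name `counting_sort` is chosen for illustration and is not taken from the post.

```python
def counting_sort(values):
    """Sort non-negative integers by tallying how often each value occurs."""
    if not values:
        return []
    # counts[v] holds the number of times v appears in the input.
    counts = [0] * (max(values) + 1)
    for v in values:
        counts[v] += 1
    # Emit each value as many times as it was counted, in ascending order.
    result = []
    for value, count in enumerate(counts):
        result.extend([value] * count)
    return result


print(counting_sort([4, 0, 2, 4, 1, 0, 4]))  # [0, 0, 1, 2, 4, 4, 4]
```

Both versions need memory proportional to the largest value, so neither suits sparse inputs with a very large maximum.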

what 262,715 regex questions on stack overflow haven't answered (part 2)

0 comments

Lobsters