Toolbox
Header image: Derived (cropped) from "Framaspace" by David Revoy, framasoft.org – CC-BY 4.0
Text, Code & Co.
A big part of your studies will be writing or drawing things down, whether it’s program code, the solution to an exercise sheet, a term paper, or something else entirely. The following resources might help you with this.
Text Editors & Development Environments
Visual Studio Code is an ingenious, extensible open source editor from Microsoft. Available for desktop and web, it is suitable for taking notes in lectures, editing exercise sheets, programming projects and so on.
Joplin is an open source Markdown editor with extras (LaTeX, …). Also suitable for taking notes in lectures.
CryptPad allows you to work together on documents (spreadsheets, text, slides, surveys, …). Very similar to Google Docs or Microsoft 365, but all your data is end-to-end encrypted and the application is open source. You do not even need an account.
Deepnote hosts Jupyter Notebooks (explained in more detail below) in the cloud that you can edit together. Kind of like Google Colab.
Visualizations
FLACI can draw automata and work with formal languages, grammars and regular expressions. Very handy for lectures like ECL or theoretical computer science.
diagrams.net (formerly draw.io) can draw flowcharts, UI mockups, ER and UML diagrams, among other things. There is also a Visual Studio Code plugin. Tip: If you choose .drawio.png or .drawio.svg as the file extension, the embeddable image file is also the file you edit with diagrams.net.
Minimum Edit Distance Calculator with visualization. Useful for ECL.
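For intuition about what such a calculator computes, here is a minimal sketch of the classic dynamic-programming algorithm for Levenshtein distance. It assumes a cost of 1 for every insertion, deletion and substitution; some textbooks charge 2 for a substitution, and the cost scheme of the linked calculator is not stated here. The function name edit_distance is chosen purely for illustration.

```python
def edit_distance(source: str, target: str) -> int:
    """Levenshtein distance with unit costs (an assumption; other cost schemes exist)."""
    # previous[j] is the distance between the already processed prefix of
    # source and target[:j]. We start with the empty prefix of source.
    previous = list(range(len(target) + 1))
    for i, s_char in enumerate(source, start=1):
        # Turning source[:i] into the empty string takes i deletions.
        current = [i]
        for j, t_char in enumerate(target, start=1):
            substitution = previous[j - 1] + (s_char != t_char)  # free if equal
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


print(edit_distance("kitten", "sitting"))
print(edit_distance("intention", "execution"))
```

Only two rows of the table are kept in memory at a time, which is enough as long as you need the distance itself and not the alignment.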
LaTeX
Overleaf is a website where you can work together on LaTeX documents (like presentations, project reports or papers). The institute hosts its own instance. Alternatively, Overleaf.com has a free version, but it offers much less compile time.
LaTeX Tables Editor is a fancy graphical editor for LaTeX tables, so you don’t have to mess with the code for them yourself anymore.
Detexify lets you draw characters (e.g. an ℝ) and outputs the LaTeX code to typeset that character.
LaTeX for Linguists gives an overview of a variety of LaTeX packages relevant to linguistic texts.
LaTeX course that a student regularly gave (or still gives). All lectures and exercises are available online. Highly recommended, whether you work through it from start to finish or use it as a quick reference on specific topics.
LaTeX Templates for exercise sheets, seminar papers and theses can be found in our old exams repository on GitLab. If you want to create a presentation with LaTeX, you could take inspiration from the tutorial slides; some tutors upload their LaTeX source code to GitLab (e.g. the tutorial for Programmieren II).
Literature
Zotero is a reference management program. With Zotero you can collect all relevant papers, books, etc., mostly with a single click in the browser, and generate a bibliography later (BibTeX and BibLaTeX are supported, as are office programs such as Word). By the way, the university library regularly offers short introductions to Zotero.
News, documentation and tutorials
Hacker News collects articles, news and blog posts on topics related to computers, science and generally everything “that satisfies the intellectual curiosity”. If you want to stay informed about what’s happening in the IT and research world, this website is highly recommended.
devdocs.io is a website that provides documentation for many programming languages and libraries through a single, easily searchable interface. The site also works offline, which helps if you often work on the go without internet access.
AI Coffee Break with Letitia explains current papers and developments in artificial intelligence. If you like the videos: Letiţia also gives seminars and lectures here.
Sentdex produces many tutorial videos on Python, especially in the area of artificial intelligence, as well as commentary and paper discussions. The older videos include tutorials on algorithms, web development, and data analysis.
Siraj Raval proudly presents tutorials that try to bring current developments in research, such as reinforcement learning or GANs, to the viewer in a very simple way.
Python libraries
For many programming problems, solutions already exist in the form of a Python library. Some of them you will inevitably come across during your studies, others not. However, it is often worth knowing these.
Compare strings
fuzzywuzzy is a library for fuzzy string matching based on Levenshtein distance, which you can use, for example, to filter strings that are similar but not identical.
Recognize text language
langdetect is a useful library that makes it easy to detect the language of a text.
Websites
flask is a rather lightweight web framework that leaves a lot of freedom. Tutron, for example, is based on it.
django is a rather heavyweight web framework that imposes a fairly fixed structure on a project.
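To give an impression of how lightweight flask is, here is a minimal sketch of an application with two routes. The route paths, function names and messages are made up for illustration and have nothing to do with how Tutron is built.

```python
from flask import Flask, jsonify

app = Flask(__name__)


@app.route("/")
def index():
    # A returned string is sent to the browser as an HTML response.
    return "Hello from Flask!"


@app.route("/api/greeting/<name>")
def greeting(name: str):
    # The <name> part of the URL is passed in as an argument;
    # jsonify turns the keyword arguments into a JSON response.
    return jsonify(message=f"Hello, {name}!")


# Flask's built-in test client calls the routes without starting a server.
with app.test_client() as client:
    print(client.get("/").get_data(as_text=True))
    print(client.get("/api/greeting/Ada").get_json())
```

Calling app.run() instead of using the test client would start Flask's development server, which is fine for local experiments but not meant for production.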
Extract text from documents
textract can extract texts from documents like PDFs. Very handy, because this is not such a simple task.
Web Scraping
requests simplifies working with HTTP requests to pages.
beautifulsoup4 parses HTML and XML.
requests-html is like Requests and Beautifulsoup4 together on steroids – it supports JavaScript, for example, because it simply launches a full Chromium browser in the background. Handy for the big web scraping projects.
scrapy is a framework for writing scrapers.
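A typical small scraper combines requests and beautifulsoup4: one downloads the page, the other parses it. The sketch below extracts all links from a piece of HTML. The helper names extract_links and fetch_links and the sample HTML are invented for this example; fetch_links is defined but not called, because it needs network access.

```python
import requests
from bs4 import BeautifulSoup


def extract_links(html: str) -> list[tuple[str, str]]:
    """Return (link text, href) pairs for every <a> tag that has an href."""
    # "html.parser" is the parser from Python's standard library.
    soup = BeautifulSoup(html, "html.parser")
    return [
        (a.get_text(strip=True), a["href"])
        for a in soup.find_all("a", href=True)
    ]


def fetch_links(url: str) -> list[tuple[str, str]]:
    """Download a page and extract its links (requires network access)."""
    response = requests.get(url, timeout=10)
    response.raise_for_status()  # raise an exception on HTTP error codes
    return extract_links(response.text)


SAMPLE_HTML = """
<html><body>
  <a href="https://example.org/a">First</a>
  <a>No target</a>
  <a href="/b"> Second </a>
</body></html>
"""

print(extract_links(SAMPLE_HTML))
```

Keeping the parsing step separate from the download makes it easy to try out on saved pages before you send any requests to a real server.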
Cool Stuff & Gimmicks
jupyterlab allows you to directly connect code and documentation in the form of web-based, interactive Jupyter notebooks. Handy for demos, teaching, data science, …
termcolor lets you print colored and formatted text to the terminal.
tqdm shows cool progress bars.
genanki programmatically creates Anki decks. Nothing directly related to Coli, but it certainly has applications. For example, how about combining this library with requests, beautifulsoup4, textract and scikit-learn to automatically generate flashcards from slides uploaded in Moodle courses?
Statistics & Machine Learning
scipy implements a large number of algorithms for scientific work, such as correlation tests and ready-made similarity measures like cosine similarity.
scikit-learn implements many machine learning algorithms.
sympy allows you to write mathematical expressions in Python, which can then be simplified, differentiated, integrated, or evaluated for concrete numbers.
statsmodels contains many functions for statistical analysis of data, e.g. correlation.
pandas allows tabular data to be processed. However, the interfaces, which are based on the R programming language, take some getting used to.
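As a small taste of two of these libraries, the following sketch uses scipy for a similarity measure and a correlation test, and sympy for symbolic calculus. All numbers and the example expression are made up. Note that scipy provides the cosine distance, so the similarity has to be computed as 1 minus that value.

```python
import numpy as np
import sympy as sp
from scipy.spatial.distance import cosine
from scipy.stats import pearsonr

# --- scipy: similarity and correlation on made-up data ---
a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 4.0, 6.0])

# scipy returns the cosine *distance*; similarity = 1 - distance.
cosine_similarity = 1.0 - cosine(a, b)

# pearsonr returns the correlation coefficient and a p-value.
r, p_value = pearsonr([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])

print(f"cosine similarity: {cosine_similarity:.3f}")
print(f"Pearson r: {r:.3f} (p = {p_value:.3f})")

# --- sympy: symbolic differentiation, integration and evaluation ---
x = sp.symbols("x")
expr = x**2 * sp.sin(x)

derivative = sp.diff(expr, x)            # product rule, done symbolically
antiderivative = sp.integrate(2 * x, x)  # integration constant is omitted
value = expr.subs(x, sp.pi / 2)          # exact value, not a float

print(derivative)
print(antiderivative)
print(value)
```

The vectors a and b point in the same direction, so their cosine similarity is 1 up to floating-point error, regardless of their different lengths.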
Deep Learning
torch (= PyTorch) is a library for Machine Learning, which is used very often in NLP. You will certainly be using it during your studies.
keras is a framework for building neural networks with very little code. It works as an abstraction layer on top of TensorFlow or Theano.
Visualization
matplotlib is the classic visualization library for Python programs.
seaborn is an extension of matplotlib with much nicer defaults and automatic integration of pandas dataframes.
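A first matplotlib plot needs only a few lines. The sketch below draws a sine curve and renders it as a PNG into an in-memory buffer; the data, labels and title are arbitrary choices for illustration. The "Agg" backend is selected so that the script also runs on machines without a display.

```python
import io
import math

import matplotlib

matplotlib.use("Agg")  # render off-screen, no window needed
import matplotlib.pyplot as plt

# Sample sin(x) on [0, 6.2] in steps of 0.1.
xs = [i / 10 for i in range(63)]
ys = [math.sin(x) for x in xs]

fig, ax = plt.subplots(figsize=(6, 3))
ax.plot(xs, ys, label="sin(x)")
ax.set_xlabel("x")
ax.set_ylabel("sin(x)")
ax.set_title("A first matplotlib plot")
ax.legend()
fig.tight_layout()

# Write the PNG into memory; pass a file name such as "sine.png"
# instead of the buffer to save the image to disk.
buffer = io.BytesIO()
fig.savefig(buffer, format="png", dpi=150)
print(f"rendered {buffer.getbuffer().nbytes} bytes")
```

In a Jupyter notebook you would normally skip the backend selection and the buffer, since figures are displayed inline.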
Text processing
nltk contains many algorithms for text processing.
spacy implements state-of-the-art algorithms for POS tagging, dependency parsing and NER.
textblob contains algorithms for processing English texts, like an automatic sentiment analysis based on a dictionary.
textblob-de is like textblob, but for German.
polyglot includes algorithms such as POS tagging and sentiment analysis for over 100 different languages.
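Many of these libraries need additional models or corpora to be downloaded before they do anything useful. The following sketch therefore sticks to parts of nltk that work out of the box: a regex-based tokenizer, a frequency distribution and bigrams. The example sentence is made up.

```python
import nltk
from nltk.tokenize import wordpunct_tokenize

text = "The cat sat on the mat. The dog sat on the log."

# wordpunct_tokenize splits on regular expressions, so unlike nltk's
# default word_tokenize it needs no downloaded tokenizer data.
tokens = [token.lower() for token in wordpunct_tokenize(text)]

# Drop punctuation tokens such as ".".
words = [token for token in tokens if token.isalpha()]

# FreqDist counts how often each word occurs.
frequencies = nltk.FreqDist(words)

# bigrams yields all pairs of adjacent words.
bigrams = list(nltk.bigrams(words))

print(frequencies.most_common(3))
print(bigrams[:3])
```

For tasks such as POS tagging or named entity recognition, you would first download the relevant resources, for example via nltk.download(...) or a spacy model.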
To borrow
- 1× Presenter in the pool locker
This page (or parts of it) was translated automatically using DeepL.