Simon Späti 🏔️ (@ssp.sh)
#dataengineering
Dad. Technical Author, Data Engineer. Data practitioner (20y) • Writing at ssp.sh since 2015. Focused on the craft of data engineering & storytelling. 📚 vault.ssp.sh • 📖 @dedp.online ❯ #dataengineering, #opensource, #writing, #obsidian, #neovim
5,164 followers 1,123 following 3,825 posts
Simon Späti 🏔️ (@ssp.sh) reply parent
The Tuxedo seems faster at this small file test than Framework Desktop 🤯. Not sure how representative the small test itself is, it's nice to know that this laptop outperforms one of the fastest desktops available.
Simon Späti 🏔️ (@ssp.sh) reply parent
I think DHH got inspired by you :) omarchy.org/workstations/
Simon Späti 🏔️ (@ssp.sh) reply parent
Same same, but different name. Which one do you like more? More information about all of it at the usual places (also check the backlinks and graphs for related notes).
Simon Späti 🏔️ (@ssp.sh)
What's your go-to #dataarchitecture? I like the data architecture below, going from `Staging -> Cleansing -> Core -> Marts`; classical architecture of a DWH. It's what we used as a consultancy from the very beginning. Very similar to the 2nd: `Bronze -> Silver -> Gold -> Platinum`.
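A minimal sketch of how the four layers named above could hand data to each other, assuming a tiny in-memory order feed. The sample rows, function bodies, and the revenue-per-customer mart are illustrative assumptions, not the consultancy's actual implementation.

```python
from collections import defaultdict

# Hypothetical raw rows as they might land from a source system.
RAW_ORDERS = [
    {"order_id": "1", "customer": " alice ", "amount": "10.50"},
    {"order_id": "1", "customer": " alice ", "amount": "10.50"},  # duplicate delivery
    {"order_id": "2", "customer": "Bob", "amount": "4.00"},
    {"order_id": "3", "customer": "alice", "amount": "not-a-number"},  # bad record
]


def staging(rows):
    """Staging: land the data unchanged so a load can always be replayed."""
    return [dict(row) for row in rows]


def cleansing(rows):
    """Cleansing: cast types, normalize values, drop bad rows and duplicates."""
    seen, clean = set(), []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # a real pipeline would route this to an error table
        if row["order_id"] in seen:
            continue
        seen.add(row["order_id"])
        clean.append({
            "order_id": int(row["order_id"]),
            "customer": row["customer"].strip().lower(),
            "amount": amount,
        })
    return clean


def core(rows):
    """Core: the integrated, business-keyed model (here: orders by id)."""
    return {row["order_id"]: row for row in rows}


def marts(orders):
    """Marts: aggregates shaped for one consumer (here: revenue per customer)."""
    revenue = defaultdict(float)
    for order in orders.values():
        revenue[order["customer"]] += order["amount"]
    return dict(revenue)


def run_pipeline(rows):
    return marts(core(cleansing(staging(rows))))
```

Calling `run_pipeline(RAW_ORDERS)` returns `{'alice': 10.5, 'bob': 4.0}`: the duplicate and the unparseable row are dropped in Cleansing, and only the Marts step knows about the consumer-facing aggregate.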
Simon Späti 🏔️ (@ssp.sh) reply parent
haha, perfectly fine to use it. for sure :) I have to use it and am typing this on it right now. So they work. But there are definitely better ones. And now that I'm getting used to it, they are already a tiny bit better :P
Simon Späti 🏔️ (@ssp.sh)
A collection of #Omarchy workstations, one of them is mine. Can you spot it? 😉 Check it out here: omarchy.org/workstations/
Simon Späti 🏔️ (@ssp.sh) reply parent
Yeah, the price is a little off-putting, but the ratings are 5 stars only. So I'll probably give it a go, as I use them every day. And if it's beautiful, it might lead to good things, or just an empty wallet haha
Simon Späti 🏔️ (@ssp.sh)
I just found this pic, and the physical notebook caught my eye. Apparently, they are called «Nuuna Not White» and come in different colors. They look stunning. Has anyone tried? I like physical books to write daily/weekly to-dos. And brainstorm. nuuna.com/notebook-not...
Simon Späti 🏔️ (@ssp.sh) reply parent
The battery is amazing so far. I spent the whole afternoon in the library, and at the end, it was still above 70%. The keyboard is crap, I'd say MacBook low-travel quality. So it's a step backwards compared to Lenovo, but the same as a MacBook, and it also collects lots of fingerprints all over again :(
Simon Späti 🏔️ (@ssp.sh) reply parent
Good point, I've only had it for two days. I mainly bought it for the specs :P First impressions: display is very, very crisp, almost retina quality, better than my 4K monitor, it seems. The speed is amazing, I feel a difference compared to my Lenovo with 32 GB and a slower CPU.
Simon Späti 🏔️ (@ssp.sh) reply parent
Wow, amazing achievement, congrats! Looks incredible. Only a little jealous. :)
Simon Späti 🏔️ (@ssp.sh) reply parent
Stress testing it with data engineering projects. hellodata-be with 50s of containers, and ClickHouse and Rill :)
Simon Späti 🏔️ (@ssp.sh) reply parent
The journey continues.
Simon Späti 🏔️ (@ssp.sh) reply parent
Price ~1'600 EUR. It's insane what you get for that money, IMO.
Simon Späti 🏔️ (@ssp.sh)
Today's Office. New machine, same OS. Tuxedo Laptop, 128 GB RAM 🤯. Very smooth. #Omarchy
Simon Späti 🏔️ (@ssp.sh) reply parent
More below in case of interest, I also used WordPress in between.
Simon Späti 🏔️ (@ssp.sh) reply parent
I just noticed my ASCII/character art, such as .::News::. I guess I was already into that back then, so no one can say it's because of AI, I did it long before AI 😉🙈
Simon Späti 🏔️ (@ssp.sh)
My website in 2005. I cropped an image I liked with Adobe Photoshop (I believe) and created an HTML page. Then I converted it into a template using PHP and only changed the content. Almost the same approach I use today with Hugo, except for Markdown & static. Otherwise, same same 20 years later!
Simon Späti 🏔️ (@ssp.sh)
Omarchy spotted! 🙃
Simon Späti 🏔️ (@ssp.sh) reply parent
@szarnyasg.org Great article. Love the #Omarchy vibes 😉, and Osaka Jade is a great choice. It is also great to see how performant DuckDB is on plain Linux. I will re-run the test on my upcoming Tuxedo 128 GB laptop. And if you are running Obsidian, I ported the Omarchy theme to Obsidian :)
Simon Späti 🏔️ (@ssp.sh)
Data modeling. www.ssp.sh/brain/data-m...
Simon Späti 🏔️ (@ssp.sh) reply parent
Also, related. bsky.app/profile/ssp....
Simon Späti 🏔️ (@ssp.sh) reply parent
In case you want to follow along with the latest acquisitions, I curate a list in my notes. Check out its announcements and related notes here.
Simon Späti 🏔️ (@ssp.sh) reply parent
The money stream should be independent of OSS. Much easier said than done, but there are still so many great DE companies I can think of that are doing a great job of doing exactly that. I hope it stays this way.
Simon Späti 🏔️ (@ssp.sh) reply parent
Think of the Framework laptop; it's fully repairable, with every part replaceable, meaning you can replace the screen, even the motherboard, later on. This is something I want to support. Same with OSS. I believe the strategy shouldn't be to cash out on OSS.
Simon Späti 🏔️ (@ssp.sh) reply parent
Making money from open source is hard, but I still hope that many will pursue this path. When deciding on a tool, I will always pick the open-source one. To me, it builds trust, and because it's shared as a gift for anyone to use, it makes me want to support it more.
Simon Späti 🏔️ (@ssp.sh) reply parent
I think the best way for OSS products to survive is to embrace the «Declarative Data Stack» approach, where integration happens with a single configuration file. If integrated with multiple tools, you get the best of both worlds: integrated and OSS, and end-to-end analytics.
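To make the single-configuration-file idea concrete, here is a minimal sketch, assuming a made-up JSON config in which each component declares its tool and what it depends on. The keys, tool names, and the `plan` helper are all hypothetical and do not describe any particular product.

```python
import json

# Hypothetical single config file for the whole stack; every tool name
# and key below is made up for illustration.
STACK_CONFIG = """
{
  "ingestion":      {"tool": "example-ingest", "depends_on": []},
  "transformation": {"tool": "example-sql",    "depends_on": ["ingestion"]},
  "serving":        {"tool": "example-bi",     "depends_on": ["transformation"]}
}
"""


def plan(config_text):
    """Return (component, tool) pairs in an order that respects depends_on."""
    config = json.loads(config_text)
    ordered, done = [], set()

    def visit(name, path=()):
        if name in done:
            return
        if name in path:
            raise ValueError(f"circular dependency at {name!r}")
        for dep in config[name]["depends_on"]:
            visit(dep, path + (name,))
        done.add(name)
        ordered.append((name, config[name]["tool"]))

    for name in config:
        visit(name)
    return ordered
```

The point of the sketch is that the integration lives in the config: swapping one open-source tool for another means changing a value, while the run order is derived from the declared dependencies.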
Simon Späti 🏔️ (@ssp.sh)
Once it was called «Software is eating the world», now it seems the pendulum is swinging back to more unified and integrated data platforms.
Simon Späti 🏔️ (@ssp.sh) reply parent
Currently on the front page of Hacker News. Discuss there as well if you like: news.ycombinator.com
Simon Späti 🏔️ (@ssp.sh) reply parent
I hope you enjoy this. It was really fun to write, and I learned a lot myself. One lesson was that you can do simple ETL without a dedicated ETL tool, using engines you already run, such as MergeTree from ClickHouse, with Rill orchestrating it out of the box.
Simon Späti 🏔️ (@ssp.sh) reply parent
Included is a list of battle-tested tips and tricks from Rill's years of implementing real-time analytics for customers across industries—from financial transactions and programmatic advertising to IoT telemetry. Please check the essay at the usual place: www.ssp.sh/blog/practic...
Simon Späti 🏔️ (@ssp.sh) reply parent
The article is all about: 1. Real-time analytics data flow: tradeoffs & payoffs 2. NOAA weather dashboard (58M rows) 3. @clickhouse.com modeling techniques 4. Demo + code + video of showcase with different options 5. Fast visualization with a declarative, code-first BI tool, @rilldata.com.
Simon Späti 🏔️ (@ssp.sh) reply parent
In my essay, I tackle these exact questions with a practical example that models data stored and updated on S3, ingests and aggregates it with ClickHouse, and utilizes Rill for visualization. Setup:
```
curl rill.sh | sh && rill start git@github.com:sspaeti/clickhouse-modeling-rill-example.git
```
Simon Späti 🏔️ (@ssp.sh)
Data modeling has been on the back burner for way too long. But a topic that is even less written about is the modeling of real-time analytics, so-called OLAP cubes. How do we model them? Do we take a different approach than for DWHs? At what level of granularity do we persist/cache? What are common errors or practices?
Simon Späti 🏔️ (@ssp.sh) reply parent
I guess we need strong closed-source data platforms, like yours. If we want OSS to survive, we need declarative data stacks that work integrated with a set of configs. Almost plug and play. It's challenging, but many are doing this.
Simon Späti 🏔️ (@ssp.sh)
Consolidations in the data engineering market are happening fast. Tools from the modern data stack get unified into bigger data platforms. What's your take?
Simon Späti 🏔️ (@ssp.sh) reply parent
there's now this, not sure if that helps with your problem. what hardware did you install it on? and yes, it's always dns :))
Simon Späti 🏔️ (@ssp.sh)
Omarchy mentioned (and demonstrated, including a full install!) at the Rails keynote. youtu.be/gcwzWzC7gUA?...
Simon Späti 🏔️ (@ssp.sh)
TIL: AWS credentials are needed (e.g., from environment variables, IAM roles, etc.) even for public buckets. Otherwise, you can work around it with NOSIGN:
```
FROM s3(
    's3://bucket/path/file.csv.gz',
    NOSIGN, -- Forces anonymous access
    'CSV'
)
```
Simon Späti 🏔️ (@ssp.sh)
Which one is the best platform for creating a newsletter or members? What do you use? Here is a little overview that helps to choose. See more on www.ssp.sh/brain/open-s....
Retro Tech Dreams (@retrotechdreams.bsky.social) reposted
3D Pinball: Space Cadet for Windows XP
Simon Späti 🏔️ (@ssp.sh) reply parent
For the curious, here's how it's set up atm. Essentially, Markdown with shortcodes (gohugo.io/content-mana...) that Hugo provides.
Simon Späti 🏔️ (@ssp.sh) reply parent
It would be a condensed view of all my posts on social, of things I wrote, and learned. It's a way to quickly learn or get some updates without needing to be constantly online or on social media. Not sure yet, but curious what you think, or if there are similar versions of writer's room out there?
Simon Späti 🏔️ (@ssp.sh)
A quick brain dump of an idea I'm thinking of: My «Writer's Room». I'd like to share, in an easy-to-follow way, the latest things I've learned, as well as the things I'm pondering, mapped out in a timeline for you to follow along. Would that be something you are interested in 🤔?
Simon Späti 🏔️ (@ssp.sh) reply parent
is the ./clickhouse command the local chdb?
Simon Späti 🏔️ (@ssp.sh)
I created an #Omarchy feed, if anybody wants to follow along on Bluesky. Description: > Opinionated Arch distro created by DHH. Replacement for Windows or macOS that is aesthetically pleasing and heavily shortcut-oriented. Feed: bsky.app/profile/did:... GitHub project: github.com/basecamp/oma...
Simon Späti 🏔️ (@ssp.sh) reply parent
Just in case, I opened a PR. Maybe it will help someone else.
Simon Späti 🏔️ (@ssp.sh) reply parent
When closing the TUI in Omarchy, it will automatically shut down the windows and stop the Docker container as well.
Simon Späti 🏔️ (@ssp.sh)
I converted it into a TUI. Now, when I start the Windows VM with my launcher, it automatically runs the Docker image, connects to the VM via RDP, and opens a frameless Hyprland app. What a world. Check the short video below for the demo.
Simon Späti 🏔️ (@ssp.sh) reply parent
Take a look at the impressive work for the single-line setup of Windows below.
Simon Späti 🏔️ (@ssp.sh)
On #Omarchy, and quickly need to edit MS Word? Or any other Windows native app? Just run one command to have a running Windows, and a second to connect via RDC:
```
docker run -it --rm --name windows -p 8006:8006 ....
rdesktop -u docker 127.0.0.1:3389
```
Simon Späti 🏔️ (@ssp.sh)
Beautiful! Thank you @python.org.
Simon Späti 🏔️ (@ssp.sh) reply parent
ahh got it, sorry, i read it wrong.
Simon Späti 🏔️ (@ssp.sh)
Gotta love this Hill Chart history view for tracking the progress of my business. #basecamp
Simon Späti 🏔️ (@ssp.sh) reply parent
As always, related notes and more.
Simon Späti 🏔️ (@ssp.sh)
#Marketing is a way of transporting enthusiasm. It could be either good or bad enthusiasm, but usually, it is contagious and refreshing. Curated, manufactured, or fake enthusiasm no longer works well. Real enthusiasm is coming back. Authenticity and genuineness matter, especially in a world of AI.
Simon Späti 🏔️ (@ssp.sh) reply parent
It even got a reader mode where you can read each article directly in clx, in the terminal. Gotta love this tech.
Simon Späti 🏔️ (@ssp.sh)
Reading the HN comments with another TUI 😀. github.com/bensadeh/cir...
Simon Späti 🏔️ (@ssp.sh) reply parent
It's always a surprise to me. I don't write to be on HN, but I like it when more people get to read my writing. The comments are usually about the title only; only a few of the people who comment have read the article, at least that's my impression 😆 It doesn't ruin my day, it's more of a short adrenaline rush.
Simon Späti 🏔️ (@ssp.sh)
Currently, on Hacker News.
Simon Späti 🏔️ (@ssp.sh)
Will AI Replace Human Thinking? The Case for Writing and Coding Manually.
Simon Späti 🏔️ (@ssp.sh) reply parent
In Omarchy, you don't need anything except a URL, an icon, and a name. No Gemini or Claude required. But agreed, if you want to create something more advanced, I created a TUI, and it now lives as a native app in Omarchy.
Simon Späti 🏔️ (@ssp.sh) reply parent
Adding my own TUIs.
Simon Späti 🏔️ (@ssp.sh) reply parent
Thanks for sharing 🫶 Below is the process documented on Bsky.
Simon Späti 🏔️ (@ssp.sh) reply parent
Web apps 🫶
Simon Späti 🏔️ (@ssp.sh) reply parent
One downside: there is no drag-and-drop feature for uploading. I created a web app to do that.
Simon Späti 🏔️ (@ssp.sh) reply parent
Scripts for the curious (thanks to claude!) github.com/sspaeti/dotf...
Simon Späti 🏔️ (@ssp.sh)
I created my own TUI image OCR searcher for all my screenshots and recent images. I was missing the Snagit library of all my screenshots taken, and even OCR text search. Now with gum and fzf and preview with yazi, I have a much nicer and faster TUI. 🐧 Linux is beautiful. #Omarchy
Simon Späti 🏔️ (@ssp.sh) reply parent
Have you tried? it's pretty awesome
Simon Späti 🏔️ (@ssp.sh) reply parent
I had to double-check this 😅
Simon Späti 🏔️ (@ssp.sh)
Are you enjoying a nice theme on @obsidian.md? Take a look at the four I created. Really great to see that they get used. The latest one is Osaka Jade: bsky.app/profile/ssp.....
Simon Späti 🏔️ (@ssp.sh)
#databs
nixCraft (@cyberciti.biz) reposted
My Journey from macOS to Arch Linux with Omarchy www.ssp.sh/blog/macbook...
Simon Späti 🏔️ (@ssp.sh) reply parent
On Hacker News on the weekend.
Simon Späti 🏔️ (@ssp.sh) reply parent
This revives so many memories. Great one, thanks for sharing!
Simon Späti 🏔️ (@ssp.sh) reply parent
Funny to see the curve when I shared it 3 days ago, when it did nothing, but today it jumped to the front page. HN remains a mystery :)
Simon Späti 🏔️ (@ssp.sh)
It's always a great, happy surprise to randomly find out that one of my articles is on the front page of Hacker News. Today, it's my #omarchy article about moving away from macOS. Check it out at www.ssp.sh/blog/macbook... if you like #linux or using a MacBook.
Simon Späti 🏔️ (@ssp.sh) reply parent
As always, links and related thoughts on www.ssp.sh/brain/ideas-....
Simon Späti 🏔️ (@ssp.sh)
Ideas worth sharing are insights from having lived. You can't create great ideas out of thin air. You must have gone through a hard time, possibly lived in different places, or overcome obstacles. These all make great insights and ideas to share. Use the art of asking questions, and start sharing.
Simon Späti 🏔️ (@ssp.sh) reply parent
True, that's why I created an Obsidian theme too, and running it right now. Just in case :)
Simon Späti 🏔️ (@ssp.sh)
👀
Simon Späti 🏔️ (@ssp.sh) reply parent
I hope you enjoy it. Happy to discuss further. Exciting times ahead for BI and for the semantic layer. PS: It's on the front page of Hacker News right now (position 30), but it peaked yesterday evening :)
Simon Späti 🏔️ (@ssp.sh) reply parent
Think of the differentiation this way: » dataset ≠ aggregations » table columns ≠ metrics » physical table ≠ logical definition If you find yourself needing the concepts on the right side, that's when you need a semantic layer, either built into a BI tool or implemented separately.
Simon Späti 🏔️ (@ssp.sh) reply parent
Some Chapters and Insights: - When you DON'T need a Semantic Layer - Why use a semantic layer with the differentiation of «Datasets vs. Aggregations» - A practical example with DuckDB, Boring Semantic Layer (@hachej.bsky.social 👋), and Ibis. Building a DSL for our Metrics and KPIs. - SL FAQs
Simon Späti 🏔️ (@ssp.sh) reply parent
With an SL, your revenue KPI or other complex company measures are defined once in a single source of truth—no need to re-implement them over and over again. In my article, we'll look at the simplest possible SL, which uses a simple YAML file (semantics) and Python/Ibis (executing) & DuckDB.
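As a rough illustration of "defined once", here is a minimal sketch that keeps metric definitions in one place and compiles them into SQL on request. It swaps in Python's built-in `sqlite3` for DuckDB and a plain dict for the YAML file so that it runs without extra dependencies; the table, columns, metric expressions, and helper names are all assumptions made for the example, not the setup from the article.

```python
import sqlite3

# Hypothetical metric definitions; in a YAML-based setup these would be
# loaded from a file. Names and expressions are made up for illustration.
METRICS = {
    "revenue": "SUM(quantity * unit_price)",
    "order_count": "COUNT(DISTINCT order_id)",
}
DIMENSIONS = {"country", "product"}


def compile_query(metric, dimension, table="orders"):
    """Turn a metric + dimension request into SQL from the shared definitions."""
    if metric not in METRICS:
        raise KeyError(f"unknown metric: {metric}")
    if dimension not in DIMENSIONS:
        raise KeyError(f"unknown dimension: {dimension}")
    return (
        f"SELECT {dimension}, {METRICS[metric]} AS {metric} "
        f"FROM {table} GROUP BY {dimension} ORDER BY {dimension}"
    )


def query(conn, metric, dimension):
    """Run the compiled query and return the rows."""
    return conn.execute(compile_query(metric, dimension)).fetchall()


def demo_connection():
    """Build an in-memory table with three sample orders."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (order_id INT, country TEXT, product TEXT, "
        "quantity INT, unit_price REAL)"
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?, ?, ?)",
        [
            (1, "CH", "book", 2, 10.0),
            (2, "CH", "pen", 1, 2.5),
            (3, "DE", "book", 1, 10.0),
        ],
    )
    return conn
```

With `conn = demo_connection()`, `query(conn, "revenue", "country")` returns `[('CH', 22.5), ('DE', 10.0)]`. Every dashboard, report, or app that asks for `revenue` gets the same expression, so changing the definition in `METRICS` changes it everywhere.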
Simon Späti 🏔️ (@ssp.sh)
Many ask themselves, «Why would I use a semantic layer? How to build one?». But a better question is: How many times have you implemented the same revenue calculation differently across your company's dashboards, reports, and apps? This is why semantic layers exist.
Simon Späti 🏔️ (@ssp.sh) reposted reply parent
Ok, the workflow tiling window comparison video after switching to #Omarchy is now uploaded. I took one shot, so bear with me. But in case you wonder, you see the difference between macOS and Arch Linux in action (opinionated on how I use it).
Simon Späti 🏔️ (@ssp.sh) reply parent
Which one do you use? Kdenlive seems pretty OK so far.
Simon Späti 🏔️ (@ssp.sh) reply parent
Local ClickHouse. I just ran:
```
curl clickhouse.com | sh
./clickhouse server
./clickhouse client
```
They acquired chdb, but I'm not sure if it is running locally when I do this. Good question.