The leading thought

LLMs, Kubernetes, cool stuff, and whatnot

I have been experimenting with LLMs for some time, and now feels like the right moment to talk about it. My experience has formed mainly through interactions with OpenAI ChatGPT, which has a very low barrier to entry. I used it for code generation, what else!

To write code means two fundamental things:

  1. To come up with a plan of what is to be built, and roughly how.
  2. To execute the manual labor of typing the code.

The first part leads to an API: how do I communicate with the part of the world I am about to build? This is where types, access methods, and APIs take shape. The second part is writing all the code responsible for flag parsing, configuration handling, setting up a logger, initializing various clients, servers, and so on, regardless of how abstracted it may be. If you write code, you get the drift. There is a reason people have been building scaffolding tools for decades.
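To make that concrete, here is a minimal sketch of the kind of boilerplate I mean, assuming a hypothetical little server tool; the flag names, defaults, and log message are made up for illustration:

package main

import (
	"flag"
	"io"
	"log"
	"os"
)

func main() {
	// The usual scaffolding: flags first...
	addr := flag.String("addr", "localhost:8080", "address to listen on (hypothetical example flag)")
	verbose := flag.Bool("v", false, "verbose logging")
	flag.Parse()

	// ...then a logger...
	logger := log.New(os.Stderr, "", log.LstdFlags)
	if !*verbose {
		logger.SetOutput(io.Discard)
	}

	// ...and only after all of that does the interesting work begin.
	logger.Printf("starting on %s", *addr)
}

None of this is hard, it is just typing, and every new tool starts with some variation of the same block.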

It would be great if an LLM could do all that initial work: propose a structure and provide an initial implementation. In my experiments, I found that an LLM was able to give me a good enough first version of the code, something I could take, understand, and build on top of. I don't like iterating on code with an LLM for a long time; I take the first acceptable solution, even if reaching it took a few iterations.

I mean, judge for yourself:

Prompt: write a Golang IMAP client, list emails from a mailbox folder, allow listing emails from a certain date

package main

import (
	"crypto/tls"
	"flag"
	"fmt"
	"log"
	"os"
	"strings"
	"time"

	imap "github.com/emersion/go-imap"
	"github.com/emersion/go-imap/client"
)

// Version is printed with --version
const Version = "1.0.0"

func main() {
	var (
		server   = flag.String("server", "", "IMAP server address in host:port format (e.g. imap.example.com:993)")
		username = flag.String("username", "", "IMAP username")
		password = flag.String("password", "", "IMAP password (or set IMAP_PASSWORD env var)")
		mailbox  = flag.String("mailbox", "INBOX", "Mailbox folder to open")
		sinceStr = flag.String("since", "", "Only list messages SINCE this date (YYYY-MM-DD)")
		useTLS   = flag.Bool("tls", true, "Use implicit TLS (port 993)")
		startTLS = flag.Bool("starttls", false, "Use STARTTLS (usually on port 143). Mutually exclusive with --tls")
		limit    = flag.Int("limit", 200, "Maximum number of messages to fetch when --since is not provided (from most recent)")
		verbose  = flag.Bool("v", false, "Verbose logging")
		version  = flag.Bool("version", false, "Print version and exit")
	)

	flag.Parse()

	if *version {
		fmt.Println("imap-list", Version)
		return
	}

	if *server == "" || *username == "" {
		fmt.Fprintln(os.Stderr, "error: --server and --username are required")
		flag.Usage()
		os.Exit(2)
	}
	pass := *password
	if pass == "" {
		pass = os.Getenv("IMAP_PASSWORD")
	}
	if pass == "" {
		fmt.Fprintln(os.Stderr, "error: provide --password or set IMAP_PASSWORD env var")
		os.Exit(2)
	}
	if *useTLS && *startTLS {
		fmt.Fprintln(os.Stderr, "error: --tls and --starttls are mutually exclusive")
		os.Exit(2)
	}

	log.SetFlags(0)
	if !*verbose {
		log.SetOutput(new(discardLogger))
	}

	// Connect
	var c *client.Client
	var err error
	if *useTLS {
		// Implicit TLS (recommended)
		c, err = client.DialTLS(*server, &tls.Config{ServerName: hostOnly(*server)})
	} else {
		c, err = client.Dial(*server)
	}
	check(err, "dial IMAP server")
	defer c.Logout()

	if *startTLS {
		// Upgrade connection
		err = c.StartTLS(&tls.Config{ServerName: hostOnly(*server)})
		check(err, "STARTTLS failed")
	}

	// Login
	err = c.Login(*username, pass)
	check(err, "login failed")

	// Select mailbox
	mbox, err := c.Select(*mailbox, true)
	check(err, "select mailbox")

	if mbox.Messages == 0 {
		fmt.Println("(no messages)")
		return
	}

	// Decide strategy: search by date or fetch last N
	var uids []uint32
	if *sinceStr != "" {
		// Parse YYYY-MM-DD
		since, err := time.Parse("2006-01-02", *sinceStr)
		if err != nil {
			fatalf("invalid --since date %q (expected YYYY-MM-DD)", *sinceStr)
		}
		criteria := imap.NewSearchCriteria()
		criteria.Since = since
		uids, err = c.UidSearch(criteria)
		check(err, "search since date")
		if len(uids) == 0 {
			fmt.Printf("No messages since %s in %s\n", since.Format("2006-01-02"), *mailbox)
			return
		}
	} else {
		// Fetch last N UIDs using mailbox status
		seqset := new(imap.SeqSet)
		// Build a sequence range for the last N messages by message sequence numbers
		var from uint32 = 1
		if mbox.Messages > uint32(*limit) {
			from = mbox.Messages - uint32(*limit) + 1
		}
		seqset.AddRange(from, mbox.Messages)
		// Convert MSNs to UIDs
		uids, err = fetchUIDs(c, seqset)
		check(err, "resolve UIDs for last messages")
	}

	// Fetch message data by UID in chunks to avoid over-fetching
	const chunk = 500
	for i := 0; i < len(uids); i += chunk {
		end := i + chunk
		if end > len(uids) {
			end = len(uids)
		}
		uidset := new(imap.SeqSet)
		for _, u := range uids[i:end] {
			uidset.AddNum(u)
		}

		items := []imap.FetchItem{imap.FetchEnvelope, imap.FetchInternalDate, imap.FetchRFC822Size}
		messages := make(chan *imap.Message, 50)
		go func() {
			if err := c.UidFetch(uidset, items, messages); err != nil {
				log.Printf("fetch error: %v", err)
			}
		}()

		for msg := range messages {
			printMsgLine(msg)
		}
	}
}

// fetchUIDs converts a sequence set of MSNs to their UIDs by fetching the UID item.
func fetchUIDs(c *client.Client, seq *imap.SeqSet) ([]uint32, error) {
	messages := make(chan *imap.Message, 50)
	items := []imap.FetchItem{imap.FetchUid}
	var out []uint32
	go func() {
		_ = c.Fetch(seq, items, messages)
	}()
	for msg := range messages {
		out = append(out, msg.Uid)
	}
	return out, nil
}

func printMsgLine(msg *imap.Message) {
	if msg == nil || msg.Envelope == nil {
		return
	}
	from := ""
	if len(msg.Envelope.From) > 0 {
		from = addrToString(msg.Envelope.From[0])
	}
	subj := sanitize(msg.Envelope.Subject)
	date := ""
	if !msg.InternalDate.IsZero() {
		date = msg.InternalDate.Format(time.RFC3339)
	}
	fmt.Printf("UID=%d | %s | %s | %s | %d bytes\n", msg.Uid, date, from, subj, msg.Size)
}

func addrToString(a *imap.Address) string {
	name := sanitize(a.PersonalName)
	mail := ""
	if a.MailboxName != "" && a.HostName != "" {
		mail = fmt.Sprintf("%s@%s", a.MailboxName, a.HostName)
	}
	if name != "" {
		return fmt.Sprintf("%s <%s>", name, mail)
	}
	return mail
}

func sanitize(s string) string {
	s = strings.ReplaceAll(s, "\n", " ")
	s = strings.ReplaceAll(s, "\r", " ")
	s = strings.TrimSpace(s)
	return s
}

func hostOnly(hostport string) string {
	if i := strings.IndexByte(hostport, ':'); i >= 0 {
		return hostport[:i]
	}
	return hostport
}

func check(err error, what string) {
	if err != nil {
		fatalf("%s: %v", what, err)
	}
}

func fatalf(format string, args ...any) {
	fmt.Fprintf(os.Stderr, format+"\n", args...)
	os.Exit(1)
}

type discardLogger struct{}

func (d *discardLogger) Write(p []byte) (int, error) { return len(p), nil }

Now, I do not know enough about IMAP to be the jury on the quality of this code. To this day I haven't used it, nor have I tried to understand it. But somehow this example speaks to me: I would accept it as my starting point and take over from there.

However, I would still enjoy something helping me write the next few lines. Something like GitHub Copilot is incredibly useful here. It is easier to control the change, easier to make a mental note, like in old-school manual programming. Why? Because a Copilot-like solution materializes the thoughts we have already formed. The code it writes we have already seen, noted, conceptualized, and finally accepted, all in a short time span, but the process of physically typing every letter has been eliminated. It is more than predictive typing because it understands the code and the context.

I observe a significant number of companies embracing Copilot. Something Copilot-like but hosted locally would be nice to have. I think that Tiny-LLM would be a good fit.

As that was happening, earlier this year Klarrio organized two very good, thought-provoking internal LLM-oriented hackathons. They cemented my view that the two-step approach is the right one to embrace:

  1. Take an acceptable first foundation from an LLM.
  2. Build on it with something more contextual, Copilot-like.

I have done many more interesting experiments. Some front-end JavaScript web components were pretty impressive, nothing I would expect a junior engineer to come up with within seconds. I had it build color pickers attachable to an arbitrary DOM node, draggable and resizable areas embedded into PDF.js with position, size, and offset calculation, all sorts of crazy stuff that just worked as a straight copy-paste. It would write Containerfiles, Makefiles, Bash scripts, Golang, Python.

My experience with other models has also yielded promising results.

Running a software company means that sometimes one needs to write internal software. There is never enough time for that; there is always other stuff going on.

the leading thought
have an additional pair of hands to automate the process of writing
and let me focus on conceptualizing

That thought led to other thoughts, which led to the decision to get proper equipment for real-world use cases and make it available to everyone at Klarrio GmbH. For various reasons I decided to go for a Mac Studio with a 32-core M3 Ultra CPU, an 80-core M3 Ultra GPU with Metal 3 support, and 512 GB of LPDDR5 RAM.

The machine had been lying unused for some time. I finally had an opportunity, and some motivation for good measure, to work on it.

So far there’s Ollama with Open WebUI, speech-to-text, text-to-speech, a Jupyter Notebook code runner, and Stable Diffusion. Everything runs inside a self-hosted Headscale VPN in Kubernetes on Hetzner, with valid Let’s Encrypt certificates provided by cert-manager and its Route53 DNS-01 solver for the internal network. All DNS names for HTTP-based services inside the VPN can be served TLS-encrypted.
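The cert-manager piece boils down to a ClusterIssuer with a Route53 DNS-01 solver; DNS-01 is the natural choice here because the services are only reachable inside the VPN, so an HTTP-01 challenge could never be answered. Here is a sketch of what such an issuer looks like, with the issuer name, email, region, and secret names as placeholders rather than the actual configuration:

apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-internal          # placeholder name
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: ops@example.com            # placeholder email
    privateKeySecretRef:
      name: letsencrypt-internal-account-key
    solvers:
      - dns01:
          route53:
            region: eu-central-1      # placeholder region
            accessKeyID: AKIAEXAMPLE  # placeholder credentials
            secretAccessKeySecretRef:
              name: route53-credentials
              key: secret-access-key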

It took a week from concept to working solution. And surprisingly, it wasn’t that much plumbing.

I’ll write some more soon.