---
title: The Final Newline
tags: article software
created: 2023-06-28T14:52:08Z
published: true
---

The beautiful thing about the language of mathematics is that it is precise: a definition must apply to every case
presented, 100% of them, or it is considered wrong and needs to be altered. 99% isn't good enough for a
definition. If there is even a single case that the proposed definition doesn't cover, we consider the definition to be incorrect.

I'm a little surprised that in software engineering, which must be as precise as mathematics in its definitions and specifications, we have come to accept this POSIX definition of a line:

> A sequence of zero or more non-newline characters plus a terminating newline character.

As mentioned earlier, if there is even one case that doesn't satisfy the definition, then it's incorrect.
This definition of a line completely falls apart when you look at a file with the following contents:

```text
This is a line of text.
```

Let's save this as `single-line.txt`, without a newline character at the end.

Is this a line? Of course it is. Humans don't use line terminators when writing something down in real life.
Ask any person around you and they will say "yeah, it's one line of text".

## Counting lines

Without a final newline character, POSIX-compliant programs will fail to recognize this line:

```bash
$ printf 'This is a line of text.' | wc -l
0
```

That's because the POSIX definition of a line is wrong. We have one line of text and the computer is telling us we have zero lines in our file.

Now, before you go and send pull requests to the authors of `wc`, let me quickly add: their documentation explicitly states that the `-l` flag does not, in fact, count lines. 🤔

It only counts _newlines_. The authors of GNU `wc`, Paul Rubin and David MacKenzie, probably wanted to avoid this
controversy, so the documentation says that the flag counts newline characters, not lines.

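The gap between counting newlines and counting lines can be sketched in a few lines of Go. This is only an illustration under my own assumptions, not how `wc` is implemented: `countNewlines` and `countLines` are hypothetical helper names, and `countLines` assumes that a final line without a newline still counts as a line.

```go
package main

import (
	"fmt"
	"strings"
)

// countNewlines counts newline characters, which is what `wc -l` reports.
func countNewlines(s string) int {
	return strings.Count(s, "\n")
}

// countLines is a hypothetical helper that counts lines the way a human
// reader would: an empty file has no lines, and an unterminated last line
// is still a line.
func countLines(s string) int {
	if s == "" {
		return 0
	}
	n := strings.Count(s, "\n")
	if !strings.HasSuffix(s, "\n") {
		n++ // the last line has no terminator, but it is still a line
	}
	return n
}

func main() {
	content := "This is a line of text."
	fmt.Println(countNewlines(content)) // 0
	fmt.Println(countLines(content))    // 1
}
```

Both functions agree on any input that ends with a newline; they only disagree when the final newline is missing.
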
## Separator vs. Terminator

There are two ways to interpret the newline character:

- As a line separator
- As a line terminator

With the "line separator" interpretation you basically treat the contents of a file as the result of:

```go
strings.Join(lines, "\n")
```

whereas the "line terminator" interpretation is better expressed as:

```go
// w is assumed to be a *bufio.Writer (a *strings.Builder works too).
for _, line := range lines {
	w.WriteString(line)
	w.WriteByte('\n')
}
```

## Definition vs. Regulation

As we have seen, not every line is terminated with a newline character.
Thus we cannot regard the "line terminator" interpretation as a _definition_, because it's incorrect.
It's a _regulation_.

A regulation is different: it prescribes what you must do rather than describing what something is.

## Does the POSIX regulation make sense?

The thing about regulations is that we all hate them, but we hate inconsistencies even more.

Even though I personally think that the newline character should be a line separator because it's more in line (pun intended) with how humans see a text file,
I also recognize that some fights aren't worth fighting.

It would have been a lot easier to discuss this topic in the early days of computing, before tools had adopted the POSIX standard.
Now it's hard to argue against it when your co-worker just wants an easy way to get rid of the annoying "No newline at end of file" warning on every `git diff`.

So my suggestion to developers is this: everybody needs to decide for themselves what is worth their time.

If you want to challenge existing standards and try to improve them for the future, I fully
respect that. But know what you're getting yourself into. Everything has an [opportunity cost](https://en.wikipedia.org/wiki/Opportunity_cost).
I think it's mostly the younger generation that tries to initiate changes, and that's not necessarily a bad thing.

Deep down, I really wish somebody would stop this line terminating madness someday. But this isn't my fight.

## Conclusion

- In an ideal world, newline characters would be line separators, not line terminators
- Tools in the UNIX world have already widely adopted the POSIX regulation
- It's very hard to make a change in this area
- Decide for yourself if this is a fight that is worth your time [or not](https://en.wikipedia.org/wiki/Law_of_triviality)
- If you are not the owner, adopt the style regulations of the project you are contributing to