The correct usage of scanf to read a single integer from the user.
Every part of it explained — what scanf is, why &a is required,
what %d does, and why you deliberately left out \n on the prompt.
scanf("%d", &a) — reading an integer from stdin into a local variable
    // We are going to get user input! Interesting.
    // scanf : function in the standard input output lib.
    // Users can provide NON SPACE SEPARATED INPUT.
    #include <stdio.h>

    int main(void)
    {
      int a;
      printf("Enter a number : ");   // not using \n, for input visual purposes
      scanf("%d", &a);               // &a : the ADDRESS of a. scanf writes to it.
                                     // similar to borrowing in Rust, except raw pointer
      printf("You entered this number groot : %d\n", a);
      return 0;
    }
    ❯ ./scanfv1
    Enter a number : 234
    You entered this number groot : 234
What scanf is — and why it is the opposite of printf
scanf is the input counterpart to printf:
Both are variadic functions declared in stdio.h and implemented in libc.
printf takes a format string and a list of values — it reads them
and writes formatted text to stdout. scanf takes a format string and a list
of pointers — it reads formatted text from stdin and writes parsed values into
the memory locations the pointers refer to.
Why scanf takes pointers and printf takes values:
In C, function arguments are passed by value — a copy is made. If scanf took
int a directly, it would receive a copy. Writing to the copy would change
nothing in your main. To modify a variable in the caller's scope, you must
give the callee the variable's address. scanf receives &a, dereferences
it, and writes the parsed integer directly into the memory location where a
lives on your stack. When scanf returns, a in main has the new
value.
Your Rust comparison was accurate:
"similar to borrowing a variable in Rust, except here we call it using the actual
name: reference." In Rust, &mut a creates a mutable reference —
a compile-time-checked pointer with borrow rules enforced by the compiler. In C,
&a creates a raw pointer — the address of a, with no
safety guarantees. The concept is identical (give the callee a way to write into your
variable), but Rust enforces the rules at compile time while C trusts you completely.
What %d tells scanf to do:
The format specifier %d instructs scanf to: (1) skip any leading whitespace
bytes (spaces, tabs, newlines: 0x20, 0x09, 0x0A, plus the other isspace characters),
(2) read an optional + or - sign, (3) read consecutive digit characters and accumulate them
into an integer value, (4) stop at the first byte that is not a digit, (5) convert the
accumulated characters to an int (a 32-bit signed integer on x86_64 Linux), (6) write it
to the int * argument provided. The non-matching byte that terminated parsing is left in
stdin for the next read.
printf without \n — the deliberate choice:
printf("Enter a number : ") leaves the cursor at the end of the colon and
space. The user types their input right there on the same line — which is the natural
feel of an inline prompt. Adding \n would move the cursor to the next line,
making the prompt and the input appear on separate lines. The subtle risk: stdout is
line-buffered by default when connected to a terminal. Without \n, the
prompt text might stay in the buffer and not appear before scanf blocks. In practice this
works on glibc, which flushes line-buffered output when scanf has to fetch new input from
the terminal. The C standard intends this behavior but leaves it to the implementation. For
portable code, add fflush(stdout); after the prompt printf.
What scanf returns — the thing you should always check:
scanf returns the number of items successfully matched and assigned as an int.
For scanf("%d", &a): returns 1 on success, 0
if the input did not match %d at all (typing letters, for example),
and EOF (a negative value, -1 on glibc) if the input stream ended (Ctrl+D on Linux) before any
conversion. Pressing Enter alone produces neither result: %d skips the newline and keeps
waiting. In your code the return value is ignored. Production code should always check:
if (scanf("%d", &a) != 1) { /* handle parse failure */ }. Without this
check, a failed parse leaves a uninitialized and the program continues
silently with garbage data.
Every edge case you listed in the source comments, run and recorded in
scanfv1.output. Each result explained at the level of what scanf's
internal parsing actually did.
| Input | Output | Verdict | What scanf did |
|---|---|---|---|
| 234 | 234 | ✓ correct | Skipped no whitespace. Read digits 2, 3, 4. Hit the newline and stopped, leaving it in stdin. Converted to 234. Wrote to &a. Returned 1. |
| -234 | -234 | ✓ correct | Read leading - (valid for %d). Read digits 2, 3, 4. Converted to -234. Negative integers work fine. |
| 234 256436 24562 | 234 | ✓ expected | %d stops at whitespace. Read 234, hit the space, stopped. The remaining " 256436 24562\n" stayed in the stdin buffer — unread, available for the next scanf call. |
| -234 -436 -0 | -234 | ✓ expected | Same as above. First token only. The space after -234 terminated parsing. Remaining input unread. |
| (Enter key only) | waits... then accepts 234 | ⚠ blocks | %d skips leading whitespace — newline (0x0A) is whitespace. Pressing Enter gives scanf a byte to discard, not one to parse. scanf went back to blocking on read(). Only a digit or non-whitespace breaks the loop. It never silently accepts empty input. |
| 290375398759087210985710987 | -1 | ⚠ overflow | glibc's scanf parses %d with an internal strtol-style routine. The value exceeded LONG_MAX (9,223,372,036,854,775,807 on x86_64), so the routine saturated at LONG_MAX (0x7FFFFFFFFFFFFFFF) and set errno=ERANGE. scanf then stored that long into your int a, keeping only the low 32 bits: 0xFFFFFFFF = -1 in two's complement. The C standard makes this undefined behavior; -1 is what this implementation produced. |
| -2347896587369782134687693876 | 0 | ⚠ overflow | Huge negative: the same routine saturated at LONG_MIN (0x8000000000000000). Low 32 bits stored into the int: 0x00000000 = 0. Different saturation value than the positive case, different truncation result. |
| lskadfjalk | 0 | ⚠ parse fail | First character 'l' is not a digit or minus sign. scanf failed to match %d. Returned 0 (zero successful conversions). Did not write to &a. a was uninitialized — its stack slot happened to contain zero bytes (OS zero-fills fresh pages). Undefined behavior printed as 0. |
| -kjafda | 0 | ⚠ parse fail | Leading - is syntactically valid for a negative integer. scanf consumed it, then saw 'k' — not a digit. No digits were read after the minus. scanf produced a matching failure. Same result as pure string input. |
| 234.234 | 234 | ✓ truncated | %d parsed digits 2, 3, 4. Hit the '.', which is not a digit. Stopped. Stored 234 and returned 1. The ".234" remains in stdin. %d never rounds: it simply stops at the decimal point, so float input looks truncated toward zero. |
| 0324.2342 | 324 | ✓ decimal | Leading zero is NOT treated as octal — octal interpretation only happens with integer literals in C source code (e.g. int x = 0324;), not with scanf input. scanf always reads decimal for %d. Read 0324 as decimal 324, stopped at '.'. |
| -23049.23452352 | -23049 | ✓ truncated | Negative float. Read '-', then digits until '.'. Stored -23049 and returned 1. The fractional ".23452352" stays in stdin. The result looks truncated toward zero, not toward negative infinity: -23049.7 would still give -23049, not -23050. |
Deep Explanation — stdin buffering, what stays after scanf, and overflow internals
The stdin buffer and what "unread input" means:
When you type at the terminal and press Enter, the entire line — including the newline —
goes into a kernel-level line buffer, then into libc's stdio buffer for stdin.
When scanf calls read(0, buf, n) internally, the kernel delivers all of that
buffered data at once. If scanf only consumes part of it (stopping at a space or a period),
the rest stays in libc's buffer. The next call to any stdin-reading function —
scanf, getchar, fgets — immediately reads from the
leftover buffer without blocking for new input. This is the source of a very common bug in
C programs that alternate between reading integers and reading characters: the leftover
newline from a previous read gets consumed by the next getchar() call,
skipping the intended input entirely.
Why the Enter key blocks scanf instead of submitting empty:
The newline character (0x0A) is whitespace, and %d skips all leading
whitespace. There is no concept of "empty integer input" in scanf — it will keep skipping
whitespace indefinitely until it finds a non-whitespace character to parse. If you need
to handle empty input gracefully, use fgets() to read a whole line, then
parse it with sscanf() or check if it is empty before parsing.
Integer overflow — the strtol internals:
On glibc, scanf's %d hands the actual parsing to an internal strtol-style routine.
It reads the character sequence and accumulates the value as a
long (64-bit on x86_64). When the accumulated value would exceed
LONG_MAX, the routine clips it to LONG_MAX (0x7FFFFFFFFFFFFFFF)
and sets errno = ERANGE. When it would go below LONG_MIN,
it clips to LONG_MIN (0x8000000000000000). scanf then assigns this saturated
long value through your int * argument, implicitly truncating from
64 to 32 bits. LONG_MAX truncated: 0x7FFFFFFFFFFFFFFF → low
32 bits = 0xFFFFFFFF = -1. LONG_MIN truncated:
0x8000000000000000 → low 32 bits = 0x00000000 = 0. That is
exactly what you observed. The C standard leaves an unrepresentable result undefined, so
another libc may behave differently.
The octal observation for 0324:
In C source code, an integer literal starting with 0 is interpreted as octal.
int x = 0324; sets x to 212 decimal (3×64 + 2×8 + 4). But this
interpretation happens at compile time by the lexer — it only applies to literal values
written in source code. scanf reads characters at runtime and converts them as decimal for
%d regardless of leading zeros. Use %o if you want scanf to
parse octal input.
if (scanf("%d", &a) != 1) { fprintf(stderr, "bad input\n"); return 1; }
is the minimal correct pattern going forward.
int a; was commented out. &a still used.
Compile-time error. You also questioned the compiler's deduplication note —
this tab explains exactly what it means.
'a' undeclared — and the compiler's deduplication note
    int main(void)
    {
      //int a;   ← commented out. a no longer exists.
      printf("Enter a number : ");
      scanf("%d", &a);   // ← 'a' was never declared
      printf("You entered this number groot : %d\n", a);
      return 0;
    }
    scanfv2.c: In function 'main':
    scanfv2.c:11:16: error: 'a' undeclared (first use in this function)
       11 |   scanf("%d", &a);
          |                ^
    scanfv2.c:11:16: note: each undeclared identifier is reported only once for each function it appears in
What "undeclared" means at the compiler level
The symbol table:
The C compiler maintains a symbol table — a data structure mapping names to their
type, storage class, and location. When you write int a;, the compiler
adds an entry: name = "a", type = int, storage = stack frame, offset = (calculated
by the compiler). Every subsequent reference to a is looked up in this
table to determine its type (for type-checking) and location (to generate the correct
machine code instruction).
What happens when the declaration is missing:
When the compiler encounters &a on line 11, it looks up "a" in the
symbol table. There is no entry — you commented out the declaration. The compiler
cannot produce the address of something that does not exist in its symbol table. It
cannot determine the type of a, so it cannot verify that &a
is a valid int * argument for scanf's %d. It has nothing to
work with. It errors.
Answering your question about the deduplication note:
"each undeclared identifier is reported only once for each function it appears in —
What? So if we are going to use multiple &a, will those not be reported?"
Correct — that is exactly what the note says. If you used &a ten times
in main, gcc would report the undeclared error only at the first occurrence,
on line 11. The remaining nine uses would be silently skipped in the error output.
Why gcc deduplicates: Once gcc knows "a" is undeclared, every subsequent use of "a" in the same function is a direct consequence of the exact same root cause — the missing declaration. Reporting it ten times would produce ten near-identical error lines, bury other unrelated errors below them, and force you to scroll past the noise to find other problems. By reporting once and noting the deduplication policy, gcc is telling you: "fix the declaration, and all ten references are fixed simultaneously." This is the same cascade-suppression philosophy you saw in 1.2's fartocelv1, where one missing semicolon produced a waterfall of "undeclared" errors for every variable in the declaration.
Why this is a hard compile error and not a warning:
Using an undeclared name is an unrecoverable situation. The compiler cannot guess what
type a was supposed to be. C89 did have two implicit rules: a call to an undeclared
function implicitly declared it as returning int, and a declaration with no type
specifier defaulted to int (the infamous "implicit int" rule). Neither ever covered an
undeclared variable, and C99 removed both. Under -std=c11, every name must be declared
before use. No declaration, no code generation.
&a was removed. scanf("%d") was called with no destination pointer.
Compiled without -Wall — no warnings. Ran. Segfaulted immediately on integer input.
Survived on string input. You asked: "is that a segmentation error? SEGV?" — this tab answers fully.
Compiled silently — crashed at runtime — fish reported SIGSEGV
    int main(void)
    {
      int a;
      printf("Enter a number : ");
      scanf("%d");   // ← &a removed. no destination pointer provided.
      printf("You entered this number groot : %d\n", a);
      return 0;
    }
    ❯ gcc -o scanfv3 scanfv3.c
    (no output: compiled silently. no -Wall, no -Wformat check.)

    ❯ ./scanfv3
    Enter a number : 234234
    fish: Job 1, './scanfv3' terminated by signal SIGSEGV (Address boundary error)

    ❯ ./scanfv3.e.bin
    Enter a number : sf
    You entered this number groot : 0        ← string input: no crash!

    ❯ ./scanfv3.e.bin
    Enter a number : 234.234
    fish: Job 1, './scanfv3.e.bin' terminated by signal SIGSEGV (Address boundary error)

    ❯ ./scanfv3.e.bin
    Enter a number : 82375902837523857023985709847982348234932478324789234789234
    fish: Job 1, './scanfv3.e.bin' terminated by signal SIGSEGV (Address boundary error)
What SIGSEGV is — from the x86_64 ABI to the kernel signal
What happens without &a — the ABI perspective:
On x86_64 Linux (the System V ABI), function arguments, variadic ones included, are passed
in registers first (rdi, rsi, rdx, rcx, r8, r9 for integer/pointer arguments), then on the
stack if there are more. When you call
scanf("%d"), the compiler puts the address of the format string into
rdi (first argument). That is it. There is no second argument. scanf receives
the format string, parses it, finds %d, and internally reaches for the
pointer it expects: the second argument. It reads the value that happens to be in
rsi at that moment. rsi was not set by your call. It contains
whatever the preceding code, here the printf call, left behind. That is an arbitrary 64-bit
value, not a pointer you chose. scanf treats it as a memory address and attempts to write
the parsed integer there.
Why writing to a random address causes SIGSEGV:
Every process on Linux runs with virtual memory. The kernel maintains a page table mapping
virtual addresses to physical memory frames. Only addresses that are mapped in the page table
are valid to read or write. When scanf issues a store instruction to the garbage address
in rsi, the CPU's Memory Management Unit (MMU) walks the page table, finds no
valid mapping for that address (or finds a read-only mapping), and raises a page fault
exception. The kernel's page fault handler examines the fault: can it be resolved? (No —
there is no mapping, or the mapping is read-only, and this was not a legitimate access.) It
sends signal 11 — SIGSEGV — to the process. The default handler for SIGSEGV
terminates the process immediately. Fish reports what happened.
SIGSEGV name and history: "Segmentation violation." The name comes from an older memory model where a process's address space was divided into hardware segments — code segment, data segment, stack segment. Accessing memory outside the bounds of any valid segment triggered a "segmentation violation." On modern x86_64 Linux, hardware segmentation is mostly not used (segments are set to cover the full address space), and the protection is done entirely by the MMU's page tables. But the signal name and its conventional meaning — "you accessed memory you do not own" — persist unchanged.
Why string input ("sf") did not crash — the crucial difference:
When you typed sf, scanf tried to match %d. The first character
is s — not a digit, not a minus sign. scanf immediately detected a matching
failure and returned 0. It never attempted to write anything. No store instruction
was issued to the garbage pointer. No memory access, no page fault, no signal. The program
continued to the next printf, printed a (uninitialized, stack happened to be
zero), and exited normally.
Why float input (234.234) crashed when string input did not:
234.234 starts with 2 — a valid digit. scanf successfully parsed
the integer part 234, then attempted to write it to the garbage pointer. That
write triggered the page fault. This is the exact difference: string input hits a matching
failure before any write occurs; float input passes the matching phase and only fails at
the write. Same explanation for the huge overflow number — parsing succeeded (or saturated),
write was attempted, crash.
Why it compiled silently without -Wall:
In C, calling a variadic function with fewer arguments than the format string expects is
not a syntax error. It is undefined behavior that only shows up at run time. The compiler
cannot statically check variadic argument counts without help. For scanf, that help is
gcc's built-in knowledge of the standard format functions (the same contract that
__attribute__((format(scanf, 1, 2))) declares for user-written functions), combined with
the -Wformat diagnostic pass, which is
activated by -Wall. You compiled with gcc -o scanfv3 scanfv3.c and
no warning flags, so on your setup the format-checking pass never ran. With -Wall, gcc would have
warned: warning: format '%d' expects a matching 'int *' argument [-Wformat=].
This would have caught the bug before you ran the program. This is the exact argument for
always using -Wall -Wextra -Wpedantic.
Your note: "see Memory-1, where I will use objdump -x":
That is the right next step. objdump -x ./scanfv3 will show you the ELF
headers, section headers, and symbol table of the binary, including where the code, data,
rodata, and bss sections live. Running the broken binary under Valgrind would report the
invalid write, the faulting address, and the call stack inside scanf at that moment. Adding
--track-origins=yes also traces any uninitialized value it flags back to where that value
originated. Combining objdump's static view
with Valgrind's dynamic trace gives you a fuller picture of what the memory looked like
at the moment of the crash.
| Question | Answer |
|---|---|
| Why does scanf need &a and not a? | scanf writes a value into a memory location, so it needs an address. a is a value; &a is the address of a on your stack. Passing a itself makes scanf treat the int's value as an address, and omitting the argument makes it use whatever the register holds. Either way it writes through a garbage pointer, which typically ends in SIGSEGV. |
| What does %d do with float input like 234.234? | Reads digits until the first non-digit. The . stops parsing. Stores 234 and returns 1. The .234 stays in stdin. The result looks truncated toward zero, not toward negative infinity. |
| Why did string input not crash scanfv3, but integer input did? | String input causes a format mismatch — scanf fails before writing. No write = no garbage-pointer dereference = no SIGSEGV. Integer input succeeds in parsing, then writes to the garbage pointer — crash. |
| What does scanf return, and why does it matter? | Count of successfully assigned items: 1 on success, 0 on mismatch, EOF (-1) on end-of-stream. Always check it. Ignoring it means a failed parse leaves variables uninitialized and the program continues silently with garbage. |
| Why did the broken scanfv3 compile without error? | Missing variadic arguments are not a syntax error — only detectable with -Wformat (part of -Wall). Compiled without any flags: no check ran. Always use -Wall -Wextra -Wpedantic -std=c11. |
| What is SIGSEGV? | Signal 11. Sent by the kernel when a process accesses a virtual address with no valid page-table mapping, or a read-only mapping on a write. The MMU raises a page fault; the kernel cannot resolve it; it kills the process with SIGSEGV. |
Pass scanf an address with &.
Always. scanf("%d %d", &a, &b): two ints, two addresses.
The only exception is char[] buffers — they already decay to a pointer and do
not take &. That case comes later when strings are covered.