String Basics in C

Table of Contents

  1. Introduction
  2. What is a String in C?
  3. String Declaration and Initialization
  4. String Memory Layout
  5. String Input and Output
  6. Accessing Individual Characters
  7. String Length
  8. Common String Operations
  9. String Literals
  10. Best Practices
  11. Summary

Introduction

Strings are one of the most commonly used kinds of data in programming. C has no dedicated string type; strings are built on character arrays by convention, which differs from many other programming languages. Understanding how C handles strings is essential for:

  • Text processing and manipulation
  • User input/output handling
  • File operations
  • Data parsing
  • Network programming

What is a String in C?

Definition

In C, a string is an array of characters terminated by a null character ('\0').

char str[] = "Hello";

This creates:

str[0] = 'H'
str[1] = 'e'
str[2] = 'l'
str[3] = 'l'
str[4] = 'o'
str[5] = '\0'   ← Null terminator (marks end of string)

The Null Terminator

The null terminator ('\0') is crucial because:

  1. Marks the end - Functions know where the string ends
  2. ASCII value 0 - Not the character '0' (which is ASCII 48)
  3. Automatic - Added automatically for string literals
  4. Takes 1 byte - Must be accounted for in memory allocation

// The null character
'\0'  == 0          // True! ASCII value is 0
'\0'  == '0'        // False! '0' is ASCII 48
'\0'  == ""[0]      // True! Empty string has only null terminator

Why Null Termination?

C strings are null-terminated because:

  1. No length stored - Unlike many other languages, C doesn't store the string length alongside the data
  2. Compact interface - A function needs only a pointer; no separate length argument is passed
  3. Simplicity - Easy to iterate until '\0' is found
  4. Flexibility - String length is determined at runtime, at the cost of scanning the string to find it
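
One consequence of the points above is that the terminator, not the array size, decides where a string ends. The following is a minimal sketch; the array `embedded` and the helper `bytes_before_terminator` are illustrative names, not part of any library. It uses a literal with a '\0' in the middle to show that string functions only see the first part.

```c
#include <stddef.h>
#include <string.h>

/* Illustrative data: 8 bytes in total ("abc", '\0', "def", '\0'). */
static const char embedded[] = "abc\0def";

/* Hypothetical helper: counts the bytes before the first '\0' by scanning,
   which is what strlen has to do because no length is stored. */
size_t bytes_before_terminator(const char *s) {
    size_t n = 0;
    while (s[n] != '\0') {
        n++;
    }
    return n;
}
```

Here sizeof embedded is 8, but both strlen(embedded) and the helper return 3, because scanning stops at the first terminator.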

String Declaration and Initialization

Method 1: Character Array with Size

char str[10] = "Hello";

// Memory layout:
// [H][e][l][l][o][\0][\0][\0][\0][\0]
//  0  1  2  3  4   5   6   7   8   9
// Remaining positions are initialized to '\0'

Method 2: Character Array without Size

char str[] = "Hello";

// Compiler automatically allocates 6 bytes (5 chars + 1 null)
// [H][e][l][l][o][\0]
//  0  1  2  3  4   5

Method 3: Pointer to String Literal

char *str = "Hello";

// str points to a string literal (typically placed in read-only memory)
// WARNING: Modifying this string is undefined behavior!

Method 4: Character by Character

char str[6];
str[0] = 'H';
str[1] = 'e';
str[2] = 'l';
str[3] = 'l';
str[4] = 'o';
str[5] = '\0';  // Don't forget the null terminator!

Method 5: Using Braces (Less Common)

char str[] = {'H', 'e', 'l', 'l', 'o', '\0'};

// Must manually include null terminator!
// Without '\0', it's just a char array, not a string

Comparison of Methods

Method              Syntax                        Modifiable   Size
Array with size     char s[10] = "Hi";            Yes          Fixed (10)
Array auto-sized    char s[] = "Hi";              Yes          Auto (3)
Pointer to literal  char *s = "Hi";               No           Pointer size
Character init      char s[] = {'H','i','\0'};    Yes          Auto (3)
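
The Size column can be checked with sizeof. The sketch below is one way to do that, assuming a hypothetical helper named `declaration_sizes` that records the size of each declaration style in the same order as the table.

```c
#include <stddef.h>

/* Hypothetical helper: stores the sizeof of each declaration style in out[],
   in the same order as the rows of the comparison table. */
void declaration_sizes(size_t out[4]) {
    char fixed[10] = "Hi";             /* array with explicit size */
    char auto_sized[] = "Hi";          /* compiler counts 'H', 'i', '\0' */
    const char *literal = "Hi";        /* pointer to a string literal */
    char braces[] = {'H', 'i', '\0'};  /* character-by-character init */

    out[0] = sizeof fixed;       /* 10 */
    out[1] = sizeof auto_sized;  /* 3 */
    out[2] = sizeof literal;     /* size of a pointer, not of the text */
    out[3] = sizeof braces;      /* 3 */
}
```

Only the pointer row depends on the platform: it reports the size of the pointer itself, whatever the length of the text it points to.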

String Memory Layout

Visual Representation

String: "Hello"

Memory Address:  1000  1001  1002  1003  1004  1005
                ┌────┬────┬────┬────┬────┬────┐
                │ H  │ e  │ l  │ l  │ o  │ \0 │
                └────┴────┴────┴────┴────┴────┘
ASCII Values:    72   101  108  108  111   0

char str[] = "Hello";
The array str starts at address 1000 (the addresses are illustrative)
str[0] = 'H' (at 1000)
str[5] = '\0' (at 1005)

Array vs Pointer

char arr[] = "Hello";   // Array - holds its own copy (on the stack for a local variable)
char *ptr = "Hello";    // Pointer - points to the literal (typically read-only data)

// Key differences:
sizeof(arr)  // 6 (includes null terminator)
sizeof(ptr)  // 4 or 8 (pointer size)

arr[0] = 'J';  // OK - modifying stack memory
ptr[0] = 'J';  // UNDEFINED BEHAVIOR - read-only memory!
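
The sizeof difference also shows up when an array is passed to a function: the parameter is a pointer, so the callee no longer knows the array size. The sketch below illustrates this under the assumption of two hypothetical helpers, `size_seen_by_callee` and `modify_array_copy`.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: the parameter is a pointer, so sizeof reports
   the pointer size, not the size of the caller's array. */
size_t size_seen_by_callee(const char *s) {
    return sizeof s;
}

/* Hypothetical helper: initializes a local array from the literal,
   modifies that copy, and writes the result to out (at least 6 bytes). */
void modify_array_copy(char out[6]) {
    char arr[] = "Hello";          /* writable copy of the literal */
    arr[0] = 'J';                  /* fine: arr is the array's own storage */
    memcpy(out, arr, sizeof arr);  /* copies all 6 bytes, including '\0' */
}
```

This is why functions that write into a caller's buffer usually take the buffer size as a separate argument.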

Stack vs Data Segment

┌─────────────────────────────────────┐
│           STACK                     │
│  char arr[] = "Hello";              │
│  [H][e][l][l][o][\0]  ← Modifiable  │
├─────────────────────────────────────┤
│           HEAP                      │
│  (dynamically allocated strings)    │
├─────────────────────────────────────┤
│       DATA SEGMENT (Read-Only)      │
│  "Hello" ← String literals          │
│  char *ptr = "Hello"; points here   │
└─────────────────────────────────────┘

String Input and Output

Output Functions

printf() with %s

char str[] = "Hello, World!";
printf("%s\n", str);        // Output: Hello, World!
printf("%.5s\n", str);      // Output: Hello (first 5 chars)
printf("%20s\n", str);      // Right-aligned in a 20-character field
printf("%-20s\n", str);     // Left-aligned in a 20-character field
// The width is a minimum: "%10s" would print all 13 characters with no padding
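
The same width and precision rules apply to snprintf, which makes them easy to inspect in a buffer. The sketch below assumes two hypothetical helpers, `format_left_field` and `format_first_five`; the trailing '|' is there only to make the padding visible.

```c
#include <stdio.h>

/* Hypothetical helper: formats s left-aligned in a field of at least
   10 characters, followed by '|'. Returns snprintf's result, i.e. the
   number of characters the full output needs (excluding '\0'). */
int format_left_field(char *out, size_t out_size, const char *s) {
    return snprintf(out, out_size, "%-10s|", s);
}

/* Hypothetical helper: keeps at most the first 5 characters of s. */
int format_first_five(char *out, size_t out_size, const char *s) {
    return snprintf(out, out_size, "%.5s", s);
}
```

With "Hello" the first helper pads to 10 characters before the '|'; with "Hello, World!" nothing is cut off, because the width is only a minimum. The precision in the second helper is what actually truncates.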

puts()

char str[] = "Hello";
puts(str);  // Outputs: Hello (automatically adds newline)

putchar() (single character)

putchar('H');
putchar('\n');

Input Functions

scanf() with %s

char name[50];
printf("Enter name: ");
scanf("%s", name);  // No & needed - the array name decays to a pointer to its first element

// WARNING: scanf stops at whitespace!
// Input: "John Doe" → name = "John"

Safe scanf with width specifier

char name[50];
scanf("%49s", name);  // Read at most 49 chars (leave room for \0)
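
The width limit and the stop-at-whitespace rule can be seen without keyboard input by using sscanf, which applies the same %s rules to a string. This is a minimal sketch; `first_word` is a hypothetical helper, and the 50-byte buffer matches the example above.

```c
#include <stdio.h>

/* Hypothetical helper: extracts the first whitespace-delimited word of
   line into word, which must have room for at least 50 bytes.
   Returns 1 if a word was read, 0 otherwise. */
int first_word(const char *line, char word[50]) {
    /* %49s stores at most 49 characters plus the terminating '\0'. */
    return sscanf(line, "%49s", word) == 1;
}
```

Checking the return value matters for scanf as well: it reports how many items were converted, so a value other than 1 here means nothing was stored in the buffer.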

gets() - DANGEROUS (Removed in C11)

// NEVER USE gets() - no buffer overflow protection!
gets(str);  // DANGEROUS!

fgets() - Safe Input

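char str[100];
if (fgets(str, sizeof(str), stdin) != NULL) {
    // str now holds one line of input
}

// Reads at most sizeof(str)-1 characters
// Keeps the newline if one was read
// Null-terminates on success; returns NULL on error or end of input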

Removing Newline from fgets()

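// Requires <string.h> for strcspn and strlen
char str[100];
if (fgets(str, sizeof(str), stdin) == NULL) {
    str[0] = '\0';  // Nothing was read: treat it as an empty line
}

// Method 1: Replace newline (no effect if there is none)
str[strcspn(str, "\n")] = '\0';

// Method 2: Manual
size_t len = strlen(str);
if (len > 0 && str[len-1] == '\n') {
    str[len-1] = '\0';
}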
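
Reading a line and stripping its newline are often wrapped into one function. The sketch below shows one possible shape; `read_line` is a hypothetical helper, and taking a FILE * parameter (instead of hard-coding stdin) is a design choice made here so the same code works on files.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line from stream into buf (capacity size)
   and strips the trailing newline, if any.
   Returns 1 on success, 0 on end of input or error. */
int read_line(FILE *stream, char *buf, size_t size) {
    if (size == 0 || fgets(buf, (int)size, stream) == NULL) {
        return 0;
    }
    buf[strcspn(buf, "\n")] = '\0';  /* cut at the newline, if present */
    return 1;
}
```

A call such as read_line(stdin, name, sizeof name) then leaves name without the newline. A line longer than the buffer is split across successive calls.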

Comparison of Input Methods

Function     Buffer Safe   Reads Spaces   Includes \n   Use Case
scanf %s     No*           No             No            Single words
scanf %49s   Yes           No             No            Single words
gets         No            Yes            No            NEVER USE
fgets        Yes           Yes            Yes           Full lines

*Can be made safe with a width specifier


Accessing Individual Characters

Using Array Indexing

char str[] = "Hello";

char first = str[0];   // 'H'
char last = str[4];    // 'o'
str[0] = 'J';          // str is now "Jello"

Using Pointer Arithmetic

char str[] = "Hello";
char *ptr = str;

char first = *ptr;           // 'H'
char second = *(ptr + 1);    // 'e'
ptr++;                       // ptr now points to 'e'

Iterating Through a String

// Method 1: Index-based
char str[] = "Hello";
for (int i = 0; str[i] != '\0'; i++) {
    printf("%c ", str[i]);
}

// Method 2: Pointer-based
char *ptr = str;
while (*ptr != '\0') {
    printf("%c ", *ptr);
    ptr++;
}

// Method 3: Compact pointer style
for (char *p = str; *p; p++) {
    printf("%c ", *p);
}
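
The compact pointer loop is also the usual way to modify a string in place. The sketch below upper-cases a string; `to_upper_in_place` is a hypothetical helper, and it relies on islower and toupper from <ctype.h>.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: upper-cases s in place and returns the number of
   characters that were changed. s must point to modifiable memory
   (an array, not a string literal). */
size_t to_upper_in_place(char *s) {
    size_t changed = 0;
    for (char *p = s; *p != '\0'; p++) {
        /* The <ctype.h> functions expect an unsigned char value. */
        unsigned char c = (unsigned char)*p;
        if (islower(c)) {
            *p = (char)toupper(c);
            changed++;
        }
    }
    return changed;
}
```

Passing a pointer to a string literal to this function would be undefined behavior, for the reason given in the Array vs Pointer section.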

String Length

Using strlen()

#include <string.h>

char str[] = "Hello";
size_t len = strlen(str);  // Returns 5 (doesn't count \0)

Manual Length Calculation

// size_t (from <stddef.h>) is the type strlen itself returns
size_t my_strlen(const char *str) {
    size_t len = 0;
    while (str[len] != '\0') {
        len++;
    }
    return len;
}

// Or with pointers:
size_t my_strlen_ptr(const char *str) {
    const char *s = str;
    while (*s) s++;
    return (size_t)(s - str);
}

Length vs Size

char str[100] = "Hello";

strlen(str);   // 5 (actual string length)
sizeof(str);   // 100 (total array size; on a pointer, sizeof gives the pointer size instead)

Common String Operations

Copying Strings

#include <string.h>

char src[] = "Hello";
char dest[20];

// Using strcpy (no bounds checking: safe only if dest is known to be large enough)
strcpy(dest, src);

// Using strncpy (limits bytes copied, but does not add '\0' if src is too long)
strncpy(dest, src, sizeof(dest) - 1);
dest[sizeof(dest) - 1] = '\0';  // Ensure null termination
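
When it matters whether the copy was cut short, snprintf is a common alternative, because it always terminates the destination and reports the length it needed. The following is a sketch under that assumption; `copy_string` is a hypothetical helper, not a standard function.

```c
#include <stdio.h>

/* Hypothetical helper: copies src into dest (capacity dest_size).
   dest is always null-terminated when dest_size > 0.
   Returns 1 if the whole string fit, 0 if it was truncated. */
int copy_string(char *dest, size_t dest_size, const char *src) {
    /* snprintf returns the length src needs, excluding the '\0'. */
    int needed = snprintf(dest, dest_size, "%s", src);
    return needed >= 0 && (size_t)needed < dest_size;
}
```

Unlike strncpy, this does not pad the rest of the buffer with '\0' bytes, and the return value lets the caller react to truncation instead of silently losing data.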

Concatenating Strings

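char str1[20] = "Hello";
char str2[] = " World";

// Using strcat (no bounds checking)
strcat(str1, str2);  // str1 = "Hello World"

// Using strncat (safer): the limit is the space left in str1, minus 1 for '\0'
strncat(str1, str2, sizeof(str1) - strlen(str1) - 1);
// This second call appends again: str1 = "Hello World World"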
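
The size arithmetic in the strncat call is easy to get wrong, so it is often hidden behind a function. The sketch below is one way to do it; `append_string` is a hypothetical helper that refuses to append rather than truncate, which is a design choice and not the only reasonable one.

```c
#include <string.h>

/* Hypothetical helper: appends src to the string in dest (capacity dest_size).
   Returns 1 if everything fit; otherwise leaves dest unchanged and returns 0. */
int append_string(char *dest, size_t dest_size, const char *src) {
    size_t used = strlen(dest);
    size_t extra = strlen(src);
    if (used >= dest_size || extra >= dest_size - used) {
        return 0;  /* not enough room for src plus the terminator */
    }
    memcpy(dest + used, src, extra + 1);  /* +1 copies the '\0' as well */
    return 1;
}
```

For example, a 12-byte buffer holding "Hello" has exactly enough room for " World" (11 characters plus the terminator), and a further one-character append is rejected.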

Comparing Strings

char str1[] = "Hello";
char str2[] = "Hello";

// WRONG: This compares addresses, not content!
if (str1 == str2) {
    // Never reached: the two arrays live at different addresses
}

// CORRECT: Use strcmp
if (strcmp(str1, str2) == 0) {
    printf("Strings are equal\n");
}
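
Because strcmp returns a negative, zero, or positive value depending on the order of its arguments, it fits the comparison callback that qsort expects. The sketch below sorts an array of string pointers; `compare_strings` and `sort_strings` are hypothetical names chosen for this example.

```c
#include <stdlib.h>
#include <string.h>

/* qsort passes pointers to the array elements; each element here is
   a const char *, so the arguments are really const char * const *. */
static int compare_strings(const void *a, const void *b) {
    const char *const *left = a;
    const char *const *right = b;
    return strcmp(*left, *right);
}

/* Hypothetical helper: sorts an array of count string pointers in place. */
void sort_strings(const char **items, size_t count) {
    qsort(items, count, sizeof items[0], compare_strings);
}
```

Only the pointers are rearranged; the strings themselves are not copied or modified, so the array may point to string literals.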

strcmp() Return Values

Result   Meaning
0        Strings are equal
< 0      str1 sorts before str2 (compared by character code, so in ASCII "Z" sorts before "a")
> 0      str1 sorts after str2

String Literals

Characteristics

"Hello"   // String literal - typically stored in read-only memory

  1. Immutable - Must not be modified (attempting to is undefined behavior)
  2. Static storage - Exist for program duration
  3. Shared - Compiler may merge identical literals

String Literal Concatenation

// Adjacent string literals are concatenated at compile time
const char *msg = "Hello "
                  "World "
                  "!";
// Equivalent to: "Hello World !"

Escape Sequences in Strings

Sequence   Meaning
\n         Newline
\t         Tab
\\         Backslash
\"         Double quote
\'         Single quote
\0         Null character
\r         Carriage return
\xNN       Hex value NN

printf("Line 1\nLine 2\n");      // Two lines
printf("Tab\there\n");           // Tab character
printf("Quote: \"Hello\"\n");    // Embedded quotes
printf("Path: C:\\Users\\\n");   // Backslashes

Best Practices

1. Always Ensure Null Termination

char bad[5];
strncpy(bad, "Hello", 5);
// DANGER: No room for '\0', so bad is not a valid string!

// CORRECT:
char str[6];
strncpy(str, "Hello", 5);
str[5] = '\0';

// OR, sized from the buffer itself:
strncpy(str, "Hello", sizeof(str) - 1);
str[sizeof(str) - 1] = '\0';

2. Use Safe Functions

// Prefer these:
strncpy(dest, src, n);    // Instead of strcpy (add the '\0' yourself if src may not fit)
strncat(dest, src, n);    // Instead of strcat (n = space left in dest, minus 1)
fgets(str, n, stdin);     // Instead of gets
snprintf(str, n, ...);    // Instead of sprintf

3. Check Buffer Sizes

char buffer[100];
char input[200];   // Assume input already holds a valid string

// Check before copying
if (strlen(input) < sizeof(buffer)) {
    strcpy(buffer, input);
}

4. Initialize Strings

char str[100] = "";        // Initialize to empty string
char str2[100] = {0};      // Initialize all to null

5. Use const for Read-Only Strings

void print_message(const char *msg) {
    printf("%s\n", msg);
    // msg[0] = 'X';  // Compile error: msg points to const data
}

6. Prefer String Literals with const

const char *msg = "Hello";   // Correct - clearly read-only
char *risky = "Hello";       // Compiles, but hides that the literal must not be modified

Summary

Key Points

  1. C strings are null-terminated character arrays
  2. Null terminator ('\0') marks the end of a string
  3. String literals are typically stored in read-only memory
  4. Array strings can be modified; pointer-to-literal strings cannot
  5. strlen() returns length without null terminator
  6. sizeof returns the total size of an array (for a pointer, only the pointer's size)
  7. Use bounded functions (strncpy, strncat, fgets, snprintf) to prevent buffer overflow

String Declaration Quick Reference

// Modifiable strings:
char str1[] = "Hello";           // Auto-sized array
char str2[20] = "Hello";         // Fixed-size array
char str3[20] = {'H','i','\0'}; // Character init

// Read-only (pointer to literal):
const char *str4 = "Hello";      // Pointer to literal

Common Mistakes to Avoid

  1. Forgetting null terminator when manually building strings
  2. Using == to compare strings instead of strcmp()
  3. Modifying string literals (undefined behavior)
  4. Buffer overflow with strcpy/strcat on small buffers
  5. Off-by-one errors when allocating string memory
  6. Using gets() - always use fgets() instead

Memory Size Calculation

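char str[] = "Hello";  // Total bytes = 6 (5 chars + 1 null)

// For dynamic allocation (requires <stdlib.h>; original is an existing string):
char *copy = malloc(strlen(original) + 1);  // +1 for null!
// Check that copy is not NULL before using it, and free(copy) when done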
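
The allocation rule above can be packaged as a small function. This is a minimal sketch; `duplicate_string` is a hypothetical helper written for this example, and the caller is assumed to be responsible for calling free on the result.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap-allocated copy of original,
   or NULL if the allocation fails. The caller must free the result. */
char *duplicate_string(const char *original) {
    size_t size = strlen(original) + 1;  /* +1 for the null terminator */
    char *copy = malloc(size);
    if (copy == NULL) {
        return NULL;
    }
    memcpy(copy, original, size);        /* copies the '\0' as well */
    return copy;
}
```

The returned copy lives on the heap, so it can be modified even when original is a string literal.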