通过C语言入门计算机科学7：数据结构

本文系用户投稿，不代表机核网观点

⚠️ 未经作者授权禁止转载

当我们掌握了指针，以及栈内存、堆内存的概念，我们就可以用这些工具，来编写一些功能更强大的代码，譬如说“字符串”。前面我们了解的 “C字符串”，实际上只是一块内存，每个字节放一个数字代表字符，数字 0 表示结束。这种表达字符串的方法，在使用上有个缺点：如果要拼接两个或者更多的字符串，我们必须重新申请内存，然后复制内容到新内存上，既消耗内存又浪费时间；同样的，如果要删除字符串其中一部分，或者是在某个位置插入另外一个字符串到本字符串，都是需要重新申请内存然后复制。
所以，我们可以开始自己做一个简单的“字符串”，它要能快速的拼接、插入、删除部分字符串。为了实现这个目标，我们可以用一个叫做“链表”的概念，来完成上述任务。 
定义一个链表（结构体）
所谓的链表，就是一条由多个节点构造成的结构。每个“节点”的内存块中，都有一个指针，指向下一个“节点”。如果需要往这个链表中加入一个或者多个节点，只要修改这些节点的指针就可以了。
#include <stdio.h>
#include <stdlib.h>

struct node
{
    char val;
    struct node *next;
};


   
    
  
   
    
  
   
    
  

这段代码的 struct node{...} 表示定义了一个“结构体”，类型叫 node，也就是定义了一种新的变量类型 node。这种类型的变量，内部包含两个部分的数据，一个是字符，另外一个是结构体 node 的指针。代码上可以用 xx.val 和 xx.node 来使用这里面的数据，如果你获得的是一个 struct 的指针，则用 xx->val 和 xx->node 来使用。在这里，我们会用 NULL 这个字眼来代替数字 0，表示指针没有指向有意义的内存，实际上指针内存里面存的就是 0。
定义相关函数
接下来我们开始定义功能函数：
1. 在字符串的某个位置插入其他字符串，包含追加和新建字符串
// 插入一个字符串到链表中，在第 pos 个位置插入
// 如果 str 为 NULL，则在第 0 个位置插入
// 如果 pos 为 -1，则在最后一个位置插入
// val 是一个 C 字符串，以 \0 结尾
struct node *insert(struct node *str, int pos, char *val)
{
    // 遍历链表，找到第 pos 个位置
    struct node *p = str;
    int cur = 0;
    while (str != NULL && (pos < 0 || cur < pos) && p->next != NULL)
    {
        p = p->next;
        cur++;
    }

    // 插入字符串
    for (int i = 0; val[i] != 0; i++)
    {
        struct node *new_node = malloc(sizeof(struct node));
        new_node->val = val[i];
        new_node->next = NULL;

        if (p == NULL)
        {
            str = new_node;
        }
        else
        {
            new_node->next = p->next;
            p->next = new_node;
        }

        p = new_node;
    }

    return str;
}



   
    
  
   
    
  
   
    
  

2. 在字符串的某个位置删除某个字符串
// 删除链表中第 pos 个位置的 len 个字符
struct node *delete(struct node *str, int pos, int len)
{
    if (str == NULL)
        return NULL;

    // 遍历链表，找到第 pos 个位置
    struct node *prev = str;
    struct node *p = str;
    int cur = 0;
    while (cur < pos && p->next != NULL)
    {
        prev = p;
        p = p->next;
        cur++;
    }

    // 删除 len 个字符
    for (int i = 0; i < len; i++)
    {
        if (p == NULL)
            break;
        struct node *next = p->next;
        if (prev == p) // 如果是第一个节点
        {
            str = next;
            prev = next;
        }
        else
        {
            prev->next = next;
        }
        free(p);
        p = next;
    }

    return str;
}


   
    
  
   
    
  
   
    
  

3. 把这个字符串的内容以 C 字符数组的形式返回
// 返回值需要是一个 C 字符串，以 \0 结尾，需要自己释放内存
char *get(struct node *str)
{
    if (str == NULL)
        return NULL;

    // 遍历链表，找到看看有多少个字符
    int len = 1;
    struct node *p = str;
    while (p->next != NULL)
    {
        p = p->next;
        len++;
    }

    // 分配内存
    char *output = malloc(len + 1);

    // 遍历链表，复制字符
    p = str;
    int i = 0;
    while (p != NULL)
    {
        output[i] = p->val;
        p = p->next;
        i++;
    }
    output[i] = 0;
    return output;
}

最后写一段测试代码：
int main()
{
    struct node *str = insert(NULL, 0, "Hello, world!");
    str = insert(str, -1, " from C.");
    char *output = get(str);
    printf("output: %s\n", output);
    free(output);

    str = delete (str, 0, 7);
    output = get(str);
    printf("output: %s\n", output);
    free(output);
}

返回内容：
output: Hello, world! from C.
output: world! from C.

小结
上面这段代码，显然距离一个完整的“字符串”功能来说还很不够，而且性能也不是特别好，get() 返回堆内存指针本身也不够安全……但是通过这个例子，我们完成了一个最基本，也是最常见的数据结构的雏形：链表。这种通过指针互相关联起来的内存块所构造的数据结构，才是真正解决大量现实问题的常用工具。链表本身也有单向链表、双向链表、数组链表等很多不同的设计，用来解决各种不同的用途。链表以外，还有诸如哈希表、二叉树、跳跃表等等很多数据结构。计算机课程有一门基础课叫《算法导论》或者《数据结构》，讲的就是如何设计各种数据结构和算法，来解决一些常见的问题。对于有指针，可以随意控制内存的 C 语言来说，是更适合构造这些数据结构的语言。