xv6-labs-2020.lab6.Multi threading

Posted on 2021-05-02 Edited on 2021-07-22 In OS.xv6-labs Views:

多线程

lab6 Multi-threading

1.作业链接

https://pdos.csail.mit.edu/6.828/2020/labs/thread.html

2. 实习内容

Uthread: switching between threads

实现用户态的线程调度
这一部分主要是是实现在线程切换的时候的寄存器的保存与恢复
具体要做的是修改如下代码，实现上述功能
- user/uthread.c 中的 thread_create() 、thread_schedule()
- user/uthread_switch.S 中的 thread_switch
具体功能
- 当 thread_schedule() 调度到一个线程，而且这个线程是第一次被调度时，这个线程应该转向 thread_create()，而且需要在它自己的栈上执行
- thread_swtich() 实现寄存器的保存与恢复
  - 可以保存在 struct thread 中
  - 在 thread_schedule() 中调用 thread_switch()
注意这里的线程调度实现：user/uthread
- 事先准备好线程的所有线程的上下文，然后再进行调度，因此 thread_schedule() 只需要设置好具体执行上下文的内容，而不需要修改寄存器的值

int main(int argc, char *argv[]) {
    a_started = b_started = c_started = 0;
    a_n = b_n = c_n = 0;
    thread_init();
    // 先准备好线程, 最后再开始调度
    thread_create(thread_a);
    thread_create(thread_b);
    thread_create(thread_c);
    thread_schedule();
    exit(0);
}

(1) user/uthread.c

这一部分的实现可以参考 kernel/swtch.S、kernel/proc.h

struct thread

修改数据结构 struct thread
开辟一块区域，用于切换线程的时候的寄存器保存
- 只需要保存 callee-save 寄存器，因为 caller-saved 寄存器在调用函数之前就被保存在栈上了，而每一个线程的栈是独立的，不会修改其他线程的栈，到时候执行权回到该线程的时候，可以直接通过栈恢复 caller-saved 寄存器
RISC-V 中的寄存器如下

数据结构修改如下

// 需要保存的寄存器
struct thread_frame {
    // 通过记录这些寄存器实现切换
    // uint64 tp; 好像不会用到这个寄存器
    uint64 ra;
    uint64 sp;

    // callee-saved
    uint64 s0;
    // ...
    uint64 s11;
};

struct thread {
    // ...
    // 需要保存的寄存器
    struct thread_frame tf;
};

定义函数 thread_switch(&old, &new) 的作用如下
- 将当前的寄存器状态保存到 old 地址中
- 使用 new 地址中的寄存器代替当前的寄存器状态

thread_schedule()

添加切换线程的函数

void thread_schedule(void) {
    // ...
    if (current_thread != next_thread) {
        // ...
        // 注意之前修改了变量指向
        thread_switch((uint64)&t->tf, (uint64)&current_thread->tf);
    }
    // ...
}

thread_create()

在新建一个线程的时候需要从数组中找一个状态为 FREE 的线程，将其用于调用
- 这里没有做新建线程时已经没有状态为 FREE 的情况的判断
做完之后将返回值设置为他自己的栈结构
修改上面 strcut threadframe 中的变量 ra、sp

void thread_create(void (*func)()) {
    // ...
    t->tf.ra = (uint64)func;
    // 指向栈底(上往下增长)
    t->tf.sp = (uint64)&t->stack + STACK_SIZE;
}

(2) user/thread_switch.S

根据 thread_switch(&old, &new) 的含义确定如何操作寄存器

.text
# 将当前的寄存器状态保存到 a0 地址中
# 使用 a1 地址中的寄存器代替当前的寄存器状态
.globl thread_switch
thread_switch:
    sd ra, 0(a0)
    sd sp, 8(a0)
    sd s0, 16(a0)
    /* ... */
    sd s11, 104(a0)

    ld ra, 0(a1)
    ld sp, 8(a1)
    ld s0, 16(a1)
    /* ... */
    ld s11, 104(a1)
    ret

Using threads

使用 pthreads 库进行多线程编程
在真正的 Linux 机器上跑代码

(1) 问题

Why are there missing keys with 2 threads, but not with 1 thread? Identify a sequence of events with 2 threads that can lead to a key being missing.
当线程1在插入一个键值对的时候，此时另外一个线程同时读取了未插入的状态开始执行，此时两个线程的插入实际上是只是插入了一个元素
- 具体分析看如下代码

static void put(int key, int value) {
    int i = key % NBUCKET;
    struct entry *e = 0;
    for (e = table[i]; e != 0; e = e->next) {
        if (e->key == key)
            break;
    }
    // 此时线程1运行到这里,同时线程2开始执行 put() 函数
    // 然后在线程1执行到 insert() 之前运行到了这里
    // 那么线程2和线程1插入的位置是同一个, 也就是说最终只插入了一个元素
    if(e){
        e->value = value;
    } else {
        insert(key, value, &table[i], table[i]);
    }
}

(2) notxv6/ph.c

使用 lock 解决，要求总时间（各线程耗时之和）不能超过原来的 1.25 倍
只有 put() 函数需要加锁
看完提示就 OK 了

pthread_mutex_t lock;            // declare a lock
pthread_mutex_init(&lock, NULL); // initialize the lock
pthread_mutex_lock(&lock);       // acquire lock
pthread_mutex_unlock(&lock);     // release lock

解法 1

只加一个锁

static void put(int key, int value) {
    // ...
    pthread_mutex_lock(&lock);       // acquire lock
    for (e = table[i]; e != 0; e = e->next) {
        if (e->key == key)
            break;
    }
    if(e){
        e->value = value;
    } else {
        // the new is new.
        insert(key, value, &table[i], table[i]);
    }
    pthread_mutex_unlock(&lock);     // release lock
}

过不了测试

1 2	ph_safe: OK (37.1s) ph_fast: FAIL (76.1s)

解法2

为每一个 bucket 加一个锁

static void put(int key, int value) {
    int i = key % NBUCKET;
    struct entry *e = 0;
    pthread_mutex_lock(&lock[i]);       // acquire lock
    for (e = table[i]; e != 0; e = e->next) {
        if (e->key == key)
            break;
    }
    if(e){
        e->value = value;
    } else {
        insert(key, value, &table[i], table[i]);
    }
    pthread_mutex_unlock(&lock[i]);     // release lock
}

1 2	ph_safe: OK (29.2s) ph_fast: OK (62.3s)

Barrier

要求
- 实现一个 barrier（应用程序中的一个点），所有参与线程必须在该点等待，直到所有其他参与线程也都到达该点
- Linux 条件变量实现
需要在真正的 Linux 机器上跑代码
看提示
- pthread_cond_wait releases the mutex when called, and re-acquires the mutex before returning.

1 2	pthread_cond_wait(&cond, &mutex); // go to sleep on cond, releasing lock mutex, acquiring upon wake up pthread_cond_broadcast(&cond); // wake up every thread sleeping on cond

实现 notxv6/barrier.c 中的 barrier() 函数即可
同时需要实现支持多次 barrier，每一轮将记录变量 round + 1

(1) notxv6/barrier.c

static void barrier() {
    pthread_mutex_lock(&bstate.barrier_mutex);
    ++bstate.nthread;
    if(bstate.nthread != nthread) {
        pthread_cond_wait(&bstate.barrier_cond, &bstate.barrier_mutex);
        // 注意调用这个函数的时候会释放锁, 凡是返回到这个函数的时候会重新等待获得锁
    } else {
        ++bstate.round;
        bstate.nthread = 0;
        pthread_cond_broadcast(&bstate.barrier_cond);
    }
    // 所以得释放
    pthread_mutex_unlock(&bstate.barrier_mutex);
}

3. 实验结果

$ make qemu-gdb
uthread: OK (13.4s)
== Test answers-thread.txt == answers-thread.txt: OK
== Test ph_safe == make[1]: Entering directory '/home/banbao990/xv6-labs-2020'
make[1]: 'ph' is up to date.
make[1]: Leaving directory '/home/banbao990/xv6-labs-2020'
ph_safe: OK (27.3s)
== Test ph_fast == make[1]: Entering directory '/home/banbao990/xv6-labs-2020'
make[1]: 'ph' is up to date.
make[1]: Leaving directory '/home/banbao990/xv6-labs-2020'
ph_fast: OK (63.8s)
== Test barrier == make[1]: Entering directory '/home/banbao990/xv6-labs-2020'
make[1]: 'barrier' is up to date.
make[1]: Leaving directory '/home/banbao990/xv6-labs-2020'
barrier: OK (12.8s)
== Test time ==
time: OK
Score: 60/60

4. 遇到的困难以及收获

这个 lab 相对比较简单
第一部分就是简单的仿造内核的 swtch.S 以及 proc.h 编写代码即可
后面两个部分就是使用 pthread 库进行多线程代码的编写
锁加的位置对整体的性能可能会有较大影响，需要仔细考虑

5. 对课程或 lab 的意见和建议

暂时没有建议

6. 参考文献

https://pdos.csail.mit.edu/6.828/2019/xv6/book-riscv-rev0.pdf