Algorithm: Sorting

Watching “Aha! Algorithms”, I plan to make a reading note.

The first chapter is sorting, which describes three methods: bucket sort, bubble sort and quick sort.

bucket sort

I found it convenient at first, so I copied the C code in the original book with python I don’t know C language

The principle of this method is like when sorting a column of numbers, how big is the maximum (n) in this column of numbers, set (n+1) buckets (because 0 is to be included). Reading a number in the list (eg a), then a flag is placed on the ath bucket. Finally, traverse each bucket and display the bucket with the flag to complete the sorting.

``` python
flag = True
number_list = [] #input number list
book = [] #bucket

#Let the user enter a number, add the number to the list, and end the input if the user enters “done”
while flag:
number = input(“give me number to input:\n”)
if number == ‘done’:
flag = False
break
number_list.append(int(number))
flag = True

#Number of buckets, plus 1 because it contains 0
length = int(max(number_list)) + 1

#Set the number of flags in these buckets to 0
for i in range(length):
book.append(i)
book[i] = 0

#Insert the numbers in the number list as flags into the corresponding bucket
for i in number_list:
book[int(i)] += 1

Traverse the buckets from large to small, range(a,b,-1) a is the starting point of the traversal (inclusive), b is the end point (exclusive)

for i in range(len(book)-1, -1, -1):
#If the number of flags is greater than or equal to 1
if book[i] >= 1:
#Print several times if there are several flags
for j in range(0, int(book[i])):
print(i)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
If the input is (~~Replace the Enter key with a space to save space~~)

`10 320 44 44 32 10 done`

then the output is

`320 44 44 32 10 10`
In terms of time complexity, a total of `O(M+M+N+M+N)` times, that is, ignoring coefficients 2 and 3, a total of `O(M+N)` times. This is a very fast sorting algorithm, but the disadvantage is also obvious: if we only need to compare 5 4-digit numbers (such as `1000, 2000, 4444, 3333, 9999`), we still have to build `10000` A bucket, a huge waste of space! Also if we need to sort decimals like `1.12,2.333,2.33343`, bucket sort doesn't seem to be useful.

At this time we introduce another sorting method, bubble sort.

---
# Bubble Sort
The basic idea of ​​bubble sort is to compare two adjacent elements each time and swap them if they are in the wrong order

For example: say there are five numbers: 2, 4, 3, 9, 7

If you want to sort from largest to smallest, first compare the first number `2`

Since `2` is smaller than `4`, swap the positions:

4, `2`, 3, 9, 7 first traversal

In the same way, then compare with 3:

4, 3, `2`, 9 ,7

4, 3, 9, `2`, 7

4, 3, 9, 7, `2` traverse 4 times

Because `2` is swapped from the beginning to the end, just like bubbles surfaced, so it is vividly called bubble sort

The same is true for the following:

`4`, 3, 9, 7, 2 second pass

4, `3`, 9, 7, 2

4, 9, `3`, 7, 2

4, 9, 7, `3`, 2 traverse 3 times (2 is already the smallest)

`4`, 9, 7, 3, 2 3rd traversal

9, `4`, 7, 3, 2

9, 7, `4`, 3, 2 traverse 2 times

`9`, 7, 4, 3, 2 Fourth pass

9, `7`, 4, 3, 2 traverse 1 time

One problem with bucket sorting is that if the numbers are followed by names, the final output is only numbers and the names will not appear together, which can be solved by bubble sorting of a two-dimensional list
``` python
# enter the total list
total_list = []

#Enter the number, enter the name, enter the number and name as a list into the total list
flag = True
while flag:
cache_list = []
number = input("give me number to input:\n")
cache_list.append(number)
name = input("give me name to input:\n")
cache_list.append(name)
total_list.append(cache_list)
check = input("Done?")
if check == 'yes' or check == 'Yes':
flag = False
else:
flag = True

# Traverse (n-1) times with n inputs
for i in range(len(total_list)-1):
#As demonstrated above, it is no longer necessary to compare the already sorted, saving space
for j in range(len(total_list)-1-i):
#If a<b, replace ab
if int(total_list[j][0]) < int(total_list[j+1][0]):
cache = total_list[j]
total_list[j] = total_list[j+1]
total_list[j+1] = cache

#Print matching names from high to low according to the number
for i in range(0, len(number_list)):
print(total_list[i][1])

Because bubble sort is a double nested loop, the time complexity is O(N2).

If our computer can run 1 billion times per second, then sorting 100 million numbers, bucket sort takes only 0.1 seconds, and bubble sort takes 10 million seconds, which is 115 days long!


quicksort

Suppose we sort the seven numbers 9, 4, 3, 7, 2, 6, 5

First set the leftmost 9 as the base number, then find numbers smaller than 9 from right to left, and find numbers larger than 9 from left to right:

9, 4, 3, 7, 2, 6, 5 from right to left –> 5 is a number less than 9

9, 4, 3, 7, 2, 6, 5 From left to right –> 9 does not encounter a larger number than him, but encounters 5, when two numbers meet, then this number Swap with base number

5, 4, 3, 7, 2, 6, 9 Do this step again (5 is the base number) from right to left –> 2 is smaller than 5

5, 4, 3, 7, 2, 6, 9 from left to right –> 7 is bigger than 5

5, 4, 3, 2, 7, 6, 9 swap 2, 7

2, 4, 3, 5, 7, 6, 9 From right to left –> 7 coincides with 2, and exchanges positions with the base number 5

In this way, the left side of the benchmark number 5 is smaller than 5, and the right side is larger than 5. Continue to sort the left and right sides of 5 in this way:

2, 4, 3, 5, 6, 7, 9

2, 3, 4, 5, 6, 7, 9 this is all sorted

The reason why quicksort is faster is because each exchange is a jump compared to bubble sort. When sorting, set a reference point, put all the books less than or equal to the reference point to the left of the reference point, and put all the books greater than or equal to the reference point to the right of the reference point. In this way, in each exchange, it will not only be exchanged between adjacent numbers like bubble sort, and the exchange distance will be much larger. Of course, in the worst case, two adjacent numbers may still be exchanged. Therefore, the worst time complexity of quicksort is the same as that of bubble sort, O(N2), and its average time complexity is O(NlogN).

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
#Enter a list of numbers
number_list = []

#Define the quicksort zoom method
def quicksort(left, right):
#If the leftmost is greater than the rightmost, the operation has been completed, return directly
if left > right:
return
temp = int(number_list[left]) #The leftmost number in the list (reference number)
i = left
j = right
while i != j: #If i and j are not in contact
while int(number_list[j]) >= temp and i < j: #Start from the rightmost number in the list to find a number smaller than the base number
j -= 1

while int(number_list[i]) <= temp and i < j: #Start from the leftmost number in the list to find numbers larger than the base number
i += 1

if i < j: #If the two numbers found do not overlap, swap the two numbers
t = int(number_list[i])
number_list[i] = number_list[j]
number_list[j] = t


number_list[left] = number_list[i] #When the last two numbers are in contact, the reference number and the contacted number exchange positions
number_list[i] = temp
quicksort (left, i-1) # sort the left side of the reference number
quicksort(i+1, right) #sort the right side of the reference number

#User enters a number, 'done' ends the input
flag = True
while flag:
number = input("give me number to input:\n")
if number == 'done':
flag = False
break
number_list.append(number)
flag = True

#The leftmost number is 0, and the rightmost number is length-1 (because it contains 0)
quicksort(0, len(number_list)-1)
print(number_list)