Understanding the Open Address Method Using Double Hash

I am currently running AIZU ONLINE JUDEGE ALDS1_4_C.
Among them, I am learning how to open addressing using double hashing.

Here's the code for open addressing:

#include<stdio.h>
# include <string.h>

using namespace std;
using ll = long long;

#define M1000003
#define L14

char H[M][L]; /* Hash Table*/

int GetChar(charge){
    if(ch=='A')return1;
    else if(ch=='C') return2;
    else if(ch=='G') return3;
    else if(ch=='T') return 4;
    return 0;
}

ll GetKey(charstr[]){
    ll sum = 0, p = 1, i;
    for(i=0;i<strlen(str);i++){
        sum+=p*(GetChar(str[i])));
        p* = 5;
    }
    return sum;
}

int Hash1(ll key) {return key %M;}
int Hash2(ll key) {return(1+(key%(M-1));}

int find(char str[]){
    ll key = GetKey(str);
    intk = 0;
    while(true){
        int hash_index=(Hash1(key)+k*Hash2(key))%M;
        if(strcmp(H[hash_index], str)==0)return1;
        if(strlen(H[hash_index])==0){
            return 0;
        }
        k++;
    }
    return 0;
}

insert(charstr[]){
    ll key = GetKey(str);
    intk = 0;
    while(true){
        ll hash_index=(Hash1(key)+k*Hash2(key))%M;
        if(strcmp(H[hash_index], str)==0)return1;
        if(strlen(H[hash_index])==0){
            strcpy(H[hash_index], str);
            return1;
        }
        k++;
    }
    return 0;
}

int main() {
    inti,n;
    char str [L], com [9];
    for(i=0;i<M;i++)H[i][0]='\0';

    scanf("%d", & n);

    for(i=0;i<n;i++){
        scanf("%s%s",com,str);

        if(com[0]=='i'){
            insert(str);
        } else {
            if(find(str)){
                printf("yes\n");
            } else {
                printf("no\n");
            }
        }
    }

    return 0;
}

I don't know what I'm doing in the GetKey function that changes characters to numbers.
Specifically,

sum+=p*(GetChar(str[i])));
p* = 5;

I don't know what this p means.
If the character is similar, the index value that you changed to hash function will be close
To avoid this, do you multiply p by the number of strings or multiply p by 5 for each loop that ends?
On the Internet, I saw that is multiplying 5 by a number from 1 to 4, so I don't understand what it means.

I look forward to your kind cooperation.


algorithm
data-structure
					
					

	


		
	

	
		2022-10-22 09:15



			

			
			1 Answers


	
		
Simply put, the ACGT string is [no characters]It is considered as a key by considering that = 0, A = 1, C = 2, G = 3, and T = 4.The string shown in the sample input is expressed as a decimal number (0pXXX):
AAA=0p111=31
AAC=0p211=56#Note that the first character equals the first digit.
CCC = 0p222 = 62
AGA = 0p131 = 41
AGG = 0p331 = 91
TTT = 0p444 = 124
T = 0p4 = 4
As shown in the example above, the first character is the first digit, so multiply by 1, but the second character is 5, the third character is 25(5*5) and the n-th character is multiplied by the n-th power of 5, where p*=5; is the n-th power p.
In this method, strings and keys are one-to-one.In the code, Key is used only to find hash values, but it should be faster to use Key directly in the table.Also, since up to 12 characters TTTTTTTTTT=0p44444444444444=244140624, long should be enough instead of long, so you can optimize that much.


		
		
			

				

					
				

				
					2022-10-22 09:15
				
			
		
	
			
			If you have any answers or tips



		

	
		Popular Tags
	
	python x 4647
android x 1593
java x 1494
javascript x 1427
c x 927
c++ x 878
ruby-on-rails x 696
php x 692
python3 x 685
html x 656
	


	
		Popular Questions
	
	
	869 GDB gets version error when attempting to debug with the Presense SDK (IDE)

	818 I'm in trouble because I can't do anything about the error.TypeError: loop of ufunc does not support argument 0 of type Float which has no callable exp method

	1019 M2 Mac fails to install rbenv install 3.1.3 due to errors

	1009 Error in x, y, and format string must not be None

	1553 When building Fast API+Uvicorn environment with PyInstaller, console=False results in an error