Fast Sequence Search Based on Enhanced Suffix Arrays for Prediction of Protein Subcellular Localization

지상문; Sang-Mun Chi

Journal of KIISE: Software and Applications (정보과학회 논문지 B : 소프트웨어 및 응용)

Korean Title	단백질의 세포내 위치 예측을 위한 강화된 접미사 배열 기반의 고속 서열탐색
English Title	Fast Sequence Search Based on Enhanced Suffix Arrays for Prediction of Protein Subcellular Localization
Author	Sang-Mun Chi (지상문)
Citation	Vol. 40, No. 9, pp. 483-490 (September 2013)
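Korean Abstract (translated)	Many methods for predicting the subcellular localization of proteins use information from proteins with high sequence similarity to the query protein. This paper proposes a method for finding such highly similar proteins quickly. To this end, the enhanced suffix array, which is used to locate query DNA sequences in genome databases, is adapted for protein database search. Using top-down traversal of the enhanced suffix array and reuse of previous search results, the method finds the database proteins that frequently match subsequences of the query sequence, in time linear in the query length and independent of the database size. The Smith-Waterman algorithm is then applied to the sequences found to determine the final similar proteins. The proposed method searched about 300 times faster than BLAST, the most widely used sequence search tool, and improved accuracy over BLAST-based methods when applied to protein subcellular localization prediction.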
English Abstract	Many methods for predicting the subcellular localization of proteins exploit information from proteins that have high sequence similarity to the query sequence. This paper proposes a fast sequence search method for finding these highly similar proteins in a database. For protein database search, we adapt enhanced suffix arrays, which are used to find the positions of query DNA sequences in genome databases. For fast search, we use top-down traversal of the enhanced suffix array and reuse previously searched results. The time complexity of finding candidate proteins that have many exact matches to subsequences of the query protein is proportional only to the length of the query sequence and does not depend on the database size. The Smith-Waterman algorithm is then applied to find the most similar protein among these candidates. Compared with BLAST, the most widely used search method, the proposed method searches about 300 times faster and gives higher accuracy in protein subcellular localization prediction.
Keywords	protein subcellular localization prediction; sequence search; enhanced suffix arrays; Smith-Waterman algorithm; BLAST
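
The two-stage search described in the abstract (collect database proteins that share many exact substring matches with the query, then rescore those candidates with Smith-Waterman) can be illustrated with a minimal sketch. This is not the paper's implementation. It uses a plain suffix array built by naive sorting and a binary search per k-mer, so its lookup cost grows with the logarithm of the database size, whereas the paper reports query-length-linear search using enhanced suffix arrays with top-down traversal and result reuse. The function names, the k-mer length `k`, the candidate count `top`, and the match/mismatch/gap scores are all assumptions made for illustration; a real protein search would normally use a substitution matrix.

```python
from collections import Counter


def build_index(proteins):
    """Concatenate sequences with '$' separators and build a plain suffix array.

    Naive construction (sorting all suffixes); fine for a toy example only.
    """
    parts, owner = [], []
    for pid, seq in enumerate(proteins):
        parts.append(seq + "$")
        owner.extend([pid] * (len(seq) + 1))  # text position -> protein id
    text = "".join(parts)
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    return text, sa, owner


def kmer_range(text, sa, kmer):
    """Return the half-open suffix-array interval of suffixes starting with kmer."""
    k = len(kmer)
    lo, hi = 0, len(sa)
    while lo < hi:  # first suffix whose k-prefix is >= kmer
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + k] < kmer:
            lo = mid + 1
        else:
            hi = mid
    start, hi = lo, len(sa)
    while lo < hi:  # first suffix whose k-prefix is > kmer
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + k] <= kmer:
            lo = mid + 1
        else:
            hi = mid
    return start, lo


def candidates(text, sa, owner, query, k=3, top=2):
    """Count exact k-mer hits per database protein and keep the most-hit ones."""
    hits = Counter()
    for i in range(len(query) - k + 1):
        start, end = kmer_range(text, sa, query[i:i + k])
        for j in range(start, end):
            hits[owner[sa[j]]] += 1
    return [pid for pid, _ in hits.most_common(top)]


def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score with a linear gap penalty (assumed scores)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        cur = [0]
        for j in range(1, len(b) + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            cur.append(max(0, prev[j - 1] + s, prev[j] + gap, cur[j - 1] + gap))
        best = max(best, max(cur))
        prev = cur
    return best


def search(proteins, query, k=3, top=2):
    """Return the index of the most similar database protein, or None."""
    text, sa, owner = build_index(proteins)
    cands = candidates(text, sa, owner, query, k, top)
    return max(cands, key=lambda pid: smith_waterman(query, proteins[pid]),
               default=None)
```

Calling `search(["MKTAYIAKQR", "GGGSGGGS", "MKTAYLAKQR"], "MKTAYIAKQR")` first narrows the database to the two proteins that share k-mers with the query and then picks the one with the higher local alignment score. Restricting the expensive Smith-Waterman step to a small candidate set is the design idea the abstract credits for the speedup over BLAST.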