Essential proteins are those necessary for the survival or reproduction of a species. Discovering such proteins is fundamental to understanding the minimal requirements for cellular life, and is also relevant to disease research and drug design. With the development of high-throughput techniques, a large number of Protein-Protein Interactions (PPIs) can be used to identify essential proteins at the network level. Although a series of network-based computational methods have been proposed, improving prediction precision remains a challenge because of the high rate of false positives in PPI networks. In this paper, we propose a new method, GOS, to identify essential proteins by integrating Gene expression, Orthology, and Subcellular localization information. The gene expression and subcellular localization information are used to determine whether a neighbor in the PPI network is reliable. Only reliable neighbors are considered when we analyze the topological characteristics of a protein in a PPI network. We also analyze the orthologous attributes of each protein to reflect its conservation, and use a random walk model to integrate a protein's topological characteristics and its orthology. The experimental results on the yeast PPI network show that the proposed method GOS outperforms ten existing methods: DC, BC, CC, SC, EC, IC, NC, PeC, ION, and CSC.
Min Li, Zhibei Niu, Xiaopei Chen, Ping Zhong, Fangxiang Wu, Yi Pan
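The two ideas in the GOS abstract above (keeping only reliable neighbors, then combining topology with orthology through a random walk) can be illustrated with a minimal sketch. This is not the authors' implementation: the reliability rule (shared compartment plus membership in a co-expressed pair set), the restart weight `alpha`, the iteration count, and all function and parameter names are assumptions made for illustration.

```python
def reliable_edges(edges, location, coexpressed):
    """Keep an edge only if both proteins share a compartment and are co-expressed.

    Assumed rule for illustration; `location` maps protein -> set of compartments,
    `coexpressed` is a set of frozenset pairs judged co-expressed elsewhere.
    """
    kept = []
    for u, v in edges:
        share_loc = bool(location.get(u, set()) & location.get(v, set()))
        if share_loc and frozenset((u, v)) in coexpressed:
            kept.append((u, v))
    return kept


def random_walk_scores(nodes, edges, orthology, alpha=0.5, iterations=50):
    """Random walk with restart; the restart vector is the normalised orthology score."""
    neighbours = {n: set() for n in nodes}
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    total = sum(orthology.get(n, 0.0) for n in nodes) or 1.0
    restart = {n: orthology.get(n, 0.0) / total for n in nodes}
    score = dict(restart)
    for _ in range(iterations):
        updated = {}
        for n in nodes:
            # Each neighbour passes on an equal share of its current score.
            walk = sum(score[m] / len(neighbours[m]) for m in neighbours[n])
            updated[n] = alpha * walk + (1 - alpha) * restart[n]
        score = updated
    return score
```

Proteins would then be ranked by descending score, with the top-ranked ones taken as essential-protein candidates. A protein left with no reliable neighbors receives only its restart share.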
Proteins interact with each other to form protein complexes, and cell functionality depends on both protein interactions and these complexes. Based on the assumption that protein complexes are highly connected and correspond to the dense regions in Protein-protein Interaction Networks (PINs), many methods have been proposed to identify the dense regions in PINs. Because protein complexes may be formed by proteins with similar properties, such as topological and functional properties, we propose in this paper a protein complex identification framework (KCluster). In KCluster, a PIN is divided into K subnetworks using a K-means algorithm, and each subnetwork comprises proteins of similar degrees. We adopt a strategy based on the expected number of common neighbors to detect the protein complexes in each subnetwork. Moreover, we identify the protein complexes spanning two subnetworks by combining closely linked protein complexes from different subnetworks. Finally, we refine the predicted protein complexes using protein subcellular localization information. We apply KCluster and nine existing methods to identify protein complexes from a highly reliable yeast PIN. The results show that KCluster achieves higher Sn and Sp values and f-measures than the other nine methods. Furthermore, the number of perfect matches predicted by KCluster is significantly higher than that of the other nine methods.
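The first step of KCluster described above, dividing the network into K groups of proteins with similar degree, can be sketched as a one-dimensional K-means over node degrees. This is an illustrative sketch only: the deterministic centroid initialisation, the iteration cap, and the function names are assumptions, and the later steps (common-neighbor detection, merging across subnetworks, localization-based refinement) are not shown.

```python
def degrees(edges):
    """Count how many interactions each protein takes part in."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg


def kmeans_by_degree(deg, k, iterations=100):
    """Partition proteins into k groups of similar degree (1-D Lloyd's algorithm)."""
    values = sorted(set(deg.values()))
    if k < 1 or k > len(values):
        raise ValueError("k must be between 1 and the number of distinct degrees")
    # Assumed initialisation: spread the centroids evenly over the distinct degrees.
    if k == 1:
        centroids = [float(values[0])]
    else:
        centroids = [float(values[i * (len(values) - 1) // (k - 1)]) for i in range(k)]
    groups = [[] for _ in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for protein, d in deg.items():
            nearest = min(range(k), key=lambda c: abs(d - centroids[c]))
            groups[nearest].append(protein)
        # Empty groups keep their previous centroid.
        new_centroids = [
            sum(deg[p] for p in g) / len(g) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return groups
```

Each returned group lists the proteins of one subnetwork; the subnetwork itself would be the part of the PIN induced by those proteins.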
Deep learning provides exciting solutions in many fields, such as image analysis, natural language processing, and expert systems, and is seen as a key method for various future applications. Because it is non-invasive and offers good soft-tissue contrast, Magnetic Resonance Imaging (MRI) has attracted increasing attention in recent years. With the development of deep learning, many innovative deep learning methods have been proposed to improve MRI image processing and analysis performance. The purpose of this article is to provide a comprehensive overview of deep learning-based MRI image processing and analysis. First, a brief introduction to deep learning and the imaging modalities of MRI images is given. Then, common deep learning architectures are introduced. Next, deep learning applications for MRI images, such as image detection, image registration, image segmentation, and image classification, are discussed. Subsequently, the advantages and weaknesses of several common tools are discussed, and several deep learning tools used in MRI applications are presented. Finally, an objective assessment of deep learning in MRI applications is presented, and future developments and trends with regard to deep learning for MRI images are addressed.
Jin Liu, Yi Pan, Min Li, Ziyue Chen, Lu Tang, Chengqian Lu, Jianxin Wang
Essential proteins are vital to the survival of a cell. Various features are related to the essentiality of proteins, such as biological and topological features. Many computational methods have been developed to identify essential proteins by using these features. However, it is still a big challenge to design an effective method that is able to select suitable features and integrate them to predict essential proteins. In this work, we first collect 26 features and use SVM-RFE to select some of them to create a feature space for predicting essential proteins. We then remove the features that share the same biological meaning as other features in the feature space, according to their Pearson Correlation Coefficients (PCC). The experiments are carried out on S. cerevisiae data. Six features are determined as the best subset of features. To assess the prediction performance of our method, we further compare it with several machine learning methods, such as SVM, Naive Bayes, Bayes Network, and NBTree, using different numbers of input features. The results show that the methods using the six selected features outperform those using other feature sets, which confirms the effectiveness of our feature selection method for essential protein prediction.
Jiancheng Zhong, Jianxin Wang, Wei Peng, Zhen Zhang, Min Li
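The second stage described in the abstract above, removing features that duplicate others according to their Pearson Correlation Coefficients, might look like the following sketch. It assumes the features have already been ranked (for example by SVM-RFE, which is not implemented here). The cut-off of 0.8 and the function names are hypothetical; the abstract does not state the threshold used.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length value lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # a constant feature is treated as uncorrelated
    return cov / (spread_x * spread_y)


def drop_redundant(ranked_features, columns, threshold=0.8):
    """Walk features from best to worst rank and keep one only if its
    absolute PCC with every feature kept so far is below `threshold`."""
    kept = []
    for name in ranked_features:
        if all(abs(pearson(columns[name], columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept
```

Because the walk follows the ranking, the better-ranked member of each highly correlated pair is the one that survives.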
The recent breakthroughs in next-generation sequencing technologies, such as those of Roche 454, Illumina/Solexa, and ABI SOLID, have dramatically reduced the cost of producing short reads of the genomes of new species. The huge volume of reads, along with short read length, high coverage, and sequencing errors, poses a great challenge to de novo genome assembly. However, the paired-end information provides a new solution to these problems. In this paper, we review and compare some current assembly tools, including Newbler, CAP3, Velvet, SOAPdenovo, AllPaths, Abyss, IDBA, PE-Assembly, and Telescoper. In general, we compare the seed extension and graph-based methods that use the overlap/layout/consensus approach and the de Bruijn graph approach for assembly. At the end of the paper, we summarize these methods and discuss the future directions of genome assembly.
Yiming He, Zhen Zhang, Xiaoqing Peng, Fangxiang Wu, Jianxin Wang
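The de Bruijn graph approach mentioned in the abstract above can be illustrated with a small construction sketch: every k-mer in a read becomes an edge from its (k-1)-mer prefix to its (k-1)-mer suffix. This is a textbook-style illustration and not the data structure of any tool named above; the function name and the choice to keep repeated edges are assumptions.

```python
from collections import defaultdict


def de_bruijn(reads, k):
    """Map each (k-1)-mer to the list of (k-1)-mers that follow it in some read.

    Repeated edges are kept, so the list length reflects how often a k-mer was seen.
    """
    graph = defaultdict(list)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].append(kmer[1:])
    return dict(graph)
```

An assembler would then look for paths through this graph (ideally an Eulerian path) to spell out contigs, with error correction and paired-end information used to resolve ambiguous branches.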
Brain tumor segmentation aims to separate the different tumor tissues, such as active cells, necrotic core, and edema, from the normal brain tissues of White Matter (WM), Gray Matter (GM), and Cerebrospinal Fluid (CSF). MRI-based brain tumor segmentation studies have attracted more and more attention in recent years because Magnetic Resonance Imaging (MRI) is non-invasive and offers good soft-tissue contrast. After almost two decades of development, the innovative approaches applying computer-aided techniques to brain tumor segmentation are becoming more mature and coming closer to routine clinical application. The purpose of this paper is to provide a comprehensive overview of MRI-based brain tumor segmentation methods. First, a brief introduction to brain tumors and the imaging modalities of brain tumors is given. Then, the preprocessing operations and the state-of-the-art methods of MRI-based brain tumor segmentation are introduced. Moreover, the evaluation and validation of the results of MRI-based brain tumor segmentation are discussed. Finally, an objective assessment is presented, and future developments and trends are addressed for MRI-based brain tumor segmentation methods.
Jin Liu, Min Li, Jianxin Wang, Fangxiang Wu, Tianming Liu, Yi Pan
Evidence shows that biological systems are composed of separable functional modules. Identifying protein complexes is essential for understanding the principles of cellular functions. Many methods have been proposed to mine protein complexes from protein-protein interaction networks. However, the performance of these algorithms is still limited because experimentally detected protein-protein interactions are incomplete and noisy. This paper presents an analysis of the topological properties of protein complexes to show that, although proteins from the same complex are more highly connected than proteins from different complexes, many protein complexes do not reach the very dense level (density ≥0.8). A method is then given to mine protein complexes that are relatively dense (density ≥0.4). In the first step, a topological property is used to identify proteins that are probably in the same complex. Then, a possible boundary for the protein complex is calculated based on a minimum vertex cut. The final complex is formed by the proteins within the boundary. The method is validated on a yeast protein-protein interaction network. The results show that this method performs better in terms of sensitivity and specificity than other methods. The functional consistency is also good.
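The density thresholds quoted above (0.8 and 0.4) can be made concrete with a short sketch. It assumes the common graph-theoretic definition, density = 2|E| / (|V|(|V| - 1)) for the subgraph induced by a candidate complex. The abstract does not spell out its definition, so this formula and the function name are assumptions.

```python
def density(nodes, edges):
    """Density of the subgraph induced by `nodes`: 2|E| / (|V| * (|V| - 1)).

    Self-loops and duplicate edges are ignored; fewer than two nodes gives 0.0.
    """
    members = set(nodes)
    n = len(members)
    if n < 2:
        return 0.0
    induced = {
        frozenset((u, v))
        for u, v in edges
        if u in members and v in members and u != v
    }
    return 2 * len(induced) / (n * (n - 1))
```

Under this definition a fully connected group has density 1.0, while a chain of four proteins has density 0.5, which would count as relatively dense (≥0.4) but not very dense (≥0.8).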