Computer Science > Information Retrieval
[Submitted on 25 Sep 2020]
Title:Parsisanj: a semi-automatic component-based approach towards search engine evaluation
View PDFAbstract:Accessing to required data on the internet is wide via search engines in the last two decades owing to the huge amount of available data and the high rate of new data is generating daily. Accordingly, search engines are encouraged to make the most valuable existing data on the web searchable. Knowing how to handle a large amount of data in each step of a search engines' procedure from crawling to indexing and ranking is just one of the challenges that a professional search engine should solve. Moreover, it should also have the best practices in handling users' traffics, state-of-the-art natural language processing tools, and should also address many other challenges on the edge of science and technology. As a result, evaluating these systems is too challenging due to the level of internal complexity they have, and is crucial for finding the improvement path of the existing system. Therefore, an evaluation procedure is a normal subsystem of a search engine that has the role of building its roadmap. Recently, several countries have developed national search engine programs to build an infrastructure to provide special services based on their needs on the available data of their language on the web. This research is conducted accordingly to enlighten the advancement path of two Iranian national search engines: Yooz and Parsijoo in comparison with two international ones, Google and Bing. Unlike related work, it is a semi-automatic method to evaluate the search engines at the first pace. Eventually, we obtained some interesting results which based on them the component-based improvement roadmap of national search engines could be illustrated concretely.
Submission history
From: Amin Heydari Alashti [view email][v1] Fri, 25 Sep 2020 09:17:57 UTC (222 KB)
References & Citations
Bibliographic and Citation Tools
Bibliographic Explorer (What is the Explorer?)
Litmaps (What is Litmaps?)
scite Smart Citations (What are Smart Citations?)
Code, Data and Media Associated with this Article
CatalyzeX Code Finder for Papers (What is CatalyzeX?)
DagsHub (What is DagsHub?)
Gotit.pub (What is GotitPub?)
Papers with Code (What is Papers with Code?)
ScienceCast (What is ScienceCast?)
Demos
Recommenders and Search Tools
Influence Flower (What are Influence Flowers?)
Connected Papers (What is Connected Papers?)
CORE Recommender (What is CORE?)
arXivLabs: experimental projects with community collaborators
arXivLabs is a framework that allows collaborators to develop and share new arXiv features directly on our website.
Both individuals and organizations that work with arXivLabs have embraced and accepted our values of openness, community, excellence, and user data privacy. arXiv is committed to these values and only works with partners that adhere to them.
Have an idea for a project that will add value for arXiv's community? Learn more about arXivLabs.