What is a copyright?
A copyright is the set of exclusive legal rights authors have over their works for a limited period of time. These rights include copying the works (including parts of the works), making derivative works, distributing the works, and performing the works (this means showing a movie or playing an audio recording, as well as performing a dramatic work). Currently, the author's rights begin when a work is created. A work does not have to bear a copyright notice or be registered to be copyrighted.
Why do we have copyright?
The Constitution of the United States says that the purpose of copyright is to promote the progress of science and the useful arts. The government believed that those who create an original expression in any medium need protection for their work so they can receive appropriate compensation for their intellectual effort.
What is a work in the public domain?
A work in the public domain can be copied freely by anyone. Such works include those of the U.S. Government and works for which the copyright has expired. Generally, for works created in or after 1978, the copyright lasts for 70 years beyond the life of the author. Works created before 1978 but not published before then are subject to special rules. For works created and first published between 1950 and 1978, the copyright lasts for 95 years. For works created and first published before 1950, the copyright lasts for 28 years but could have been renewed for another 67 years.
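The duration rules above can be sketched as a small calculation. This is only an illustration of the rules as stated in this document, not legal guidance; real determinations depend on publication, registration, and renewal details that the sketch ignores:

```python
def copyright_term_expires(creation_year, author_death_year=None, renewed=False):
    """Simplified sketch of the duration rules described above.

    Returns the last year of the copyright term under those rules.
    """
    if creation_year >= 1978:
        # Life of the author plus 70 years.
        return author_death_year + 70
    if creation_year >= 1950:
        # Created and first published between 1950 and 1978: 95 years.
        return creation_year + 95
    # Before 1950: 28 years, renewable for another 67 (95 total).
    return creation_year + (95 if renewed else 28)

# A work created in 1990 by an author who died in 2000 is
# protected through 2070 under these rules.
print(copyright_term_expires(1990, author_death_year=2000))  # 2070
```
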
When planning a project, start by identifying works in the public domain which can be re-purposed in the new work. Request permissions for materials not in the public domain early in the project. If there are images or sounds for which permission to copy cannot be obtained, it is easier to redesign the project at the beginning rather than waiting until the project nears completion.
What is fair use?
Fair use provisions of the copyright law allow for limited copying or distribution of published works without the author's permission in some cases. Examples of fair use of copyrighted materials include quotation of excerpts in a review or critique, or copying of a small part of a work by a teacher or student to illustrate a lesson.
How can I tell if my copying is allowed by fair use provisions of the Law?
There are no explicit, predefined, legal specifications of how much and when one can copy, but there are guidelines for fair use. Each case of copying must be evaluated according to four factors:
The purpose and nature of the use.
If the copy is used for teaching at a non-profit institution, distributed without charge, and made by a teacher or students acting individually, then the copy is more likely to be considered as fair use. In addition, an interpretation of fair use is more likely if the copy was made spontaneously, for temporary use, not as part of an "anthology" and not as an institutional requirement or suggestion.
The nature of the copyrighted work.
For example, an article from a newspaper would be considered differently than a workbook made for instruction. With multimedia material there are different standards and permissions for different media: a digitized photo from a National Geographic, a video clip from Jaws, and an audio selection from Peter Gabriel's CD would be treated differently -- the selections are not treated as equivalent chunks of digital data.
The amount and substantiality of the material used.
In general, when other criteria are met, the copying of extracts that are "not substantial in length" when compared to the whole of which they are part may be considered fair use.
The effect of use on the potential market for or value of the work.
In general, a work that supplants the normal market is considered an infringement, but a work does not have to have an effect on the market to be an infringement.
How can a work reference the copyright owner of digital photographs, video, or sounds?
Include the copyright symbol and the name of the copyright owner directly on, under, or around the digital material. It is virtually impossible to ensure that a user will see copyright information located at any distance from the image or video; the notice should be attached directly to the material.
If the material is only used once for a class or a project, does the copyright owner need to be acknowledged?
Images, graphics, and video should be credited to their owners or sources just as written material is. Also, if you later change your mind and want to use the material for commercial purposes, it is important to know where and when you found the material and who the copyright owner is.
Can I download information to my computer?
Digital resources are licensed for the non-profit educational use of Stanford University. Use of these resources is governed by copyright law and individual license agreements. Systematic downloading, distributing, or retaining substantial portions of information is prohibited.
http://www-sul.stanford.edu/cpyright.html
Monday, November 24, 2008
Friday, November 14, 2008
Knowledge Management
Knowledge Management (KM) comprises a range of practices used in an organisation to identify, create, represent, distribute and enable adoption of insights and experiences. Such insights and experiences comprise knowledge, either embodied in individuals or embedded in organisational processes or practice. An established discipline since 1995, KM includes courses taught in the fields of business administration, information systems, management, and library and information sciences. More recently, other fields, including those focused on information and media, computer science, public health, and public policy, have also started contributing to KM research. Many large companies and non-profit organisations have resources dedicated to internal KM efforts, often as a part of their 'Business Strategy', 'Information Technology', or 'Human Resource Management' departments. Several consulting companies also exist that provide strategy and advice regarding KM to these organisations.
KM efforts typically focus on organisational objectives such as improved performance, competitive advantage, innovation, the sharing of lessons learned, and continuous improvement of the organisation. KM efforts overlap with Organisational Learning, and may be distinguished from it by a greater focus on the management of knowledge as a strategic asset and a focus on encouraging the exchange of knowledge. KM efforts can help individuals and groups to share valuable organisational insights, to reduce redundant work, to avoid reinventing the wheel, to reduce training time for new employees, to retain intellectual capital as employees turn over in an organisation, and to adapt to changing environments and markets.
http://en.wikipedia.org/wiki/Knowledge_management
Information System
The term information system (IS) sometimes refers to a system of persons, data records and activities that process the data and information in an organization, and it includes the organization's manual and automated processes. Computer-based information systems are the field of study for information technology, elements of which are sometimes called an "information system" as well, a usage some consider to be incorrect.
History of information systems
The study of information systems originated as a sub-discipline of computer science in an attempt to understand and rationalize the management of technology within organizations. It has matured into a major field of management that is increasingly emphasized as an important area of research in management studies, and it is taught at all major universities and business schools in the world. Börje Langefors introduced the concept of "Information Systems" at the third International Conference on Information Processing and Computer Science in New York in 1965. [4]
Information technology is a very important malleable resource available to executives.[5] Many companies have created a position of Chief Information Officer (CIO) that sits on the executive board with the Chief Executive Officer (CEO), Chief Financial Officer (CFO), Chief Operating Officer (COO) and Chief Technical Officer (CTO). The CTO may also serve as CIO, and vice versa.
Applications of information systems
Information systems deal with the development, use and management of an organization's IT infrastructure.
In the post-industrial information age, the focus of companies has shifted from being product-oriented to knowledge-oriented in the sense that market operators today compete in process and innovation rather than in products: the emphasis has shifted from the quality and quantity of production to the production process itself--and the services that accompany the production process.
The biggest asset of companies today is their information--represented by people, experience, know-how, innovations (patents, copyrights, trade secrets)--and for a market operator to be able to compete, he or she must have a strong information infrastructure, at the heart of which lies the information technology infrastructure. Thus the study of information systems focuses on why and how technology can be put into best use to serve the information flow within an organization.
http://en.wikipedia.org/wiki/Information_Systems
Monday, November 3, 2008
Information Technology
Information technology (IT), as defined by the Information Technology Association of America (ITAA), is "the study, design, development, implementation, support or management of computer-based information systems, particularly software applications and computer hardware." IT deals with the use of electronic computers and computer software to convert, store, protect, process, transmit, and securely retrieve information.
Today, the term information technology has ballooned to encompass many aspects of computing and technology, and the term is more recognizable than ever before. The information technology umbrella can be quite large, covering many fields. IT professionals perform a variety of duties that range from installing applications to designing complex computer networks and information databases. A few of the duties that IT professionals perform may include data management, networking, engineering computer hardware, database and software design, as well as the management and administration of entire systems.
When computer and communications technologies are combined, the result is information technology, or "infotech". Information Technology (IT) is a general term that describes any technology that helps to produce, manipulate, store, communicate, and/or disseminate information. When speaking of information technology as a whole, it is generally understood that computers and information are used together.
http://en.wikipedia.org/wiki/Information_technology
Saturday, November 1, 2008
Evaluation
Introduction to Evaluation
Evaluation is a methodological area that is closely related to, but distinguishable from, more traditional social research. Evaluation utilizes many of the same methodologies used in traditional social research, but because evaluation takes place within a political and organizational context, it requires group skills, management ability, political dexterity, sensitivity to multiple stakeholders and other skills that social research in general does not rely on as much. Here we introduce the idea of evaluation and some of the major terms and issues in the field.
Definitions of Evaluation
Probably the most frequently given definition is:
Evaluation is the systematic assessment of the worth or merit of some object
This definition is hardly perfect. There are many types of evaluations that do not necessarily result in an assessment of worth or merit -- descriptive studies, implementation analyses, and formative evaluations, to name a few. Better perhaps is a definition that emphasizes the information-processing and feedback functions of evaluation. For instance, one might say:
Evaluation is the systematic acquisition and assessment of information to provide useful feedback about some object
Both definitions agree that evaluation is a systematic endeavor and both use the deliberately ambiguous term 'object' which could refer to a program, policy, technology, person, need, activity, and so on. The latter definition emphasizes acquiring and assessing information rather than assessing worth or merit because all evaluation work involves collecting and sifting through data, making judgements about the validity of the information and of inferences we derive from it, whether or not an assessment of worth or merit results.
The Goals of Evaluation
The generic goal of most evaluations is to provide "useful feedback" to a variety of audiences including sponsors, donors, client-groups, administrators, staff, and other relevant constituencies. Most often, feedback is perceived as "useful" if it aids in decision-making. But the relationship between an evaluation and its impact is not a simple one -- studies that seem critical sometimes fail to influence short-term decisions, and studies that initially seem to have no influence can have a delayed impact when more congenial conditions arise. Despite this, there is broad consensus that the major goal of evaluation should be to influence decision-making or policy formulation through the provision of empirically-driven feedback.
Evaluation Strategies
'Evaluation strategies' means broad, overarching perspectives on evaluation. They encompass the most general groups or "camps" of evaluators; although, at its best, evaluation work borrows eclectically from the perspectives of all these camps. Four major groups of evaluation strategies are discussed here.
Scientific-experimental models are probably the most historically dominant evaluation strategies. Taking their values and methods from the sciences -- especially the social sciences -- they emphasize the desirability of impartiality, accuracy, objectivity and the validity of the information generated. Included under scientific-experimental models would be: the tradition of experimental and quasi-experimental designs; objectives-based research that comes from education; econometrically-oriented perspectives including cost-effectiveness and cost-benefit analysis; and the recent articulation of theory-driven evaluation.
The second class of strategies comprises management-oriented systems models. Two of the most common of these are PERT, the Program Evaluation and Review Technique, and CPM, the Critical Path Method. Both have been widely used in business and government in the United States. It would also be legitimate to include the Logical Framework or "Logframe" model developed at the U.S. Agency for International Development, as well as general systems theory and operations research approaches, in this category. Two management-oriented systems models were originated by evaluators: the UTOS model, where U stands for Units, T for Treatments, O for Observing Operations and S for Settings; and the CIPP model, where the C stands for Context, the I for Input, the first P for Process and the second P for Product. These management-oriented systems models emphasize comprehensiveness in evaluation, placing evaluation within a larger framework of organizational activities.
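The core idea of the Critical Path Method mentioned above can be sketched in a few lines: given task durations and dependencies, the project's minimum completion time is the length of the longest dependency chain. The task graph below is made up for illustration:

```python
def critical_path_length(durations, deps):
    """Longest path through a task dependency graph (the CPM core idea).

    durations: {task: duration}; deps: {task: [prerequisite tasks]}.
    Tasks are assumed to form a directed acyclic graph.
    """
    finish = {}  # memoized earliest finish time per task

    def earliest_finish(task):
        if task not in finish:
            # A task can start only after all its prerequisites finish.
            start = max((earliest_finish(d) for d in deps.get(task, [])), default=0)
            finish[task] = start + durations[task]
        return finish[task]

    return max(earliest_finish(t) for t in durations)

# Hypothetical project: design -> build -> test, with docs in parallel.
durations = {"design": 3, "build": 5, "test": 2, "docs": 4}
deps = {"build": ["design"], "test": ["build"], "docs": ["design"]}
print(critical_path_length(durations, deps))  # 10 (design, build, test)
```

Tasks off the critical path (here, "docs") have slack: delaying them slightly does not delay the project, which is why CPM focuses management attention on the critical chain.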
The third class of strategies are the qualitative/anthropological models. They emphasize the importance of observation, the need to retain the phenomenological quality of the evaluation context, and the value of subjective human interpretation in the evaluation process. Included in this category are the approaches known in evaluation as naturalistic or 'Fourth Generation' evaluation; the various qualitative schools; critical theory and art criticism approaches; and, the 'grounded theory' approach of Glaser and Strauss among others.
Finally, a fourth class of strategies is termed participant-oriented models. As the term suggests, they emphasize the central importance of the evaluation participants, especially clients and users of the program or technology. Client-centered and stakeholder approaches are examples of participant-oriented models, as are consumer-oriented evaluation systems.
With all of these strategies to choose from, how to decide? Debates that rage within the evaluation profession -- and they do rage -- are generally battles between these different strategists, with each claiming the superiority of their position. In reality, most good evaluators are familiar with all four categories and borrow from each as the need arises. There is no inherent incompatibility between these broad strategies -- each of them brings something valuable to the evaluation table. In fact, in recent years attention has increasingly turned to how one might integrate results from evaluations that use different strategies, carried out from different perspectives, and using different methods. Clearly, there are no simple answers here. The problems are complex and the methodologies needed will and should be varied.
Types of Evaluation
There are many different types of evaluations depending on the object being evaluated and the purpose of the evaluation. Perhaps the most important basic distinction in evaluation types is that between formative and summative evaluation. Formative evaluations strengthen or improve the object being evaluated -- they help form it by examining the delivery of the program or technology, the quality of its implementation, and the assessment of the organizational context, personnel, procedures, inputs, and so on. Summative evaluations, in contrast, examine the effects or outcomes of some object -- they summarize it by describing what happens subsequent to delivery of the program or technology; assessing whether the object can be said to have caused the outcome; determining the overall impact of the causal factor beyond only the immediate target outcomes; and, estimating the relative costs associated with the object.
Formative evaluation includes several evaluation types:
needs assessment determines who needs the program, how great the need is, and what might work to meet the need
evaluability assessment determines whether an evaluation is feasible and how stakeholders can help shape its usefulness
structured conceptualization helps stakeholders define the program or technology, the target population, and the possible outcomes
implementation evaluation monitors the fidelity of the program or technology delivery
process evaluation investigates the process of delivering the program or technology, including alternative delivery procedures
Summative evaluation can also be subdivided:
outcome evaluations investigate whether the program or technology caused demonstrable effects on specifically defined target outcomes
impact evaluation is broader and assesses the overall or net effects -- intended or unintended -- of the program or technology as a whole
cost-effectiveness and cost-benefit analysis address questions of efficiency by standardizing outcomes in terms of their dollar costs and values
secondary analysis reexamines existing data to address new questions or use methods not previously employed
meta-analysis integrates the outcome estimates from multiple studies to arrive at an overall or summary judgement on an evaluation question
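As one concrete illustration of how meta-analysis combines outcome estimates, a common approach (fixed-effect, inverse-variance weighting) averages study effect sizes weighted by their precision. The study numbers below are made up:

```python
def fixed_effect_pooled(effects, variances):
    """Fixed-effect meta-analysis: inverse-variance weighted mean.

    More precise studies (smaller variance) get more weight.
    """
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    pooled_variance = 1.0 / sum(weights)
    return pooled, pooled_variance

# Three hypothetical studies of the same program outcome.
effects = [0.30, 0.50, 0.40]     # effect size per study
variances = [0.01, 0.04, 0.02]   # sampling variance per study
pooled, var = fixed_effect_pooled(effects, variances)
print(round(pooled, 3))  # 0.357
```

This is only the simplest pooling model; real meta-analyses must also consider between-study heterogeneity (e.g. random-effects models).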
Evaluation Questions and Methods
Evaluators ask many different kinds of questions and use a variety of methods to address them. These are considered within the framework of formative and summative evaluation as presented above.
In formative research the major questions and methodologies are:
What is the definition and scope of the problem or issue, or what's the question?
Formulating and conceptualizing methods might be used including brainstorming, focus groups, nominal group techniques, Delphi methods, brainwriting, stakeholder analysis, synectics, lateral thinking, input-output analysis, and concept mapping.
Where is the problem and how big or serious is it?
The most common method used here is "needs assessment" which can include: analysis of existing data sources, and the use of sample surveys, interviews of constituent populations, qualitative research, expert testimony, and focus groups.
How should the program or technology be delivered to address the problem?
Some of the methods already listed apply here, as do detailing methodologies like simulation techniques, or multivariate methods like multiattribute utility theory or exploratory causal modeling; decision-making methods; and project planning and implementation methods like flow charting, PERT/CPM, and project scheduling.
How well is the program or technology delivered?
Qualitative and quantitative monitoring techniques, the use of management information systems, and implementation assessment would be appropriate methodologies here.
The questions and methods addressed under summative evaluation include:
What type of evaluation is feasible?
Evaluability assessment can be used here, as well as standard approaches for selecting an appropriate evaluation design.
What was the effectiveness of the program or technology?
One would choose from observational and correlational methods for demonstrating whether desired effects occurred, and quasi-experimental and experimental designs for determining whether observed effects can reasonably be attributed to the intervention and not to other sources.
What is the net impact of the program?
Econometric methods for assessing cost effectiveness and cost/benefits would apply here, along with qualitative methods that enable us to summarize the full range of intended and unintended impacts.
Clearly, this introduction is not meant to be exhaustive. Each of these methods, and the many not mentioned, are supported by an extensive methodological research literature. This is a formidable set of tools. But the need to improve, update and adapt these methods to changing circumstances means that methodological research and development needs to have a major place in evaluation work.
http://www.socialresearchmethods.net/kb/intreval.htm
Definitions of Evaluation
Probably the most frequently given definition is:
Evaluation is the systematic assessment of the worth or merit of some object
This definition is hardly perfect. There are many types of evaluations that do not necessarily result in an assessment of worth or merit -- descriptive studies, implementation analyses, and formative evaluations, to name a few. Better perhaps is a definition that emphasizes the information-processing and feedback functions of evaluation. For instance, one might say:
Evaluation is the systematic acquisition and assessment of information to provide useful feedback about some object
Both definitions agree that evaluation is a systematic endeavor and both use the deliberately ambiguous term 'object' which could refer to a program, policy, technology, person, need, activity, and so on. The latter definition emphasizes acquiring and assessing information rather than assessing worth or merit because all evaluation work involves collecting and sifting through data, making judgements about the validity of the information and of inferences we derive from it, whether or not an assessment of worth or merit results.
The Goals of Evaluation
The generic goal of most evaluations is to provide "useful feedback" to a variety of audiences including sponsors, donors, client-groups, administrators, staff, and other relevant constituencies. Most often, feedback is perceived as "useful" if it aids in decision-making. But the relationship between an evaluation and its impact is not a simple one -- studies that seem critical sometimes fail to influence short-term decisions, and studies that initially seem to have no influence can have a delayed impact when more congenial conditions arise. Despite this, there is broad consensus that the major goal of evaluation should be to influence decision-making or policy formulation through the provision of empirically-driven feedback.
Evaluation Strategies
'Evaluation strategies' means broad, overarching perspectives on evaluation. They encompass the most general groups or "camps" of evaluators; although, at its best, evaluation work borrows eclectically from the perspectives of all these camps. Four major groups of evaluation strategies are discussed here.
Scientific-experimental models are probably the most historically dominant evaluation strategies. Taking their values and methods from the sciences -- especially the social sciences -- they prioritize on the desirability of impartiality, accuracy, objectivity and the validity of the information generated. Included under scientific-experimental models would be: the tradition of experimental and quasi-experimental designs; objectives-based research that comes from education; econometrically-oriented perspectives including cost-effectiveness and cost-benefit analysis; and the recent articulation of theory-driven evaluation.
The second class of strategies are management-oriented systems models. Two of the most common of these are PERT, the Program Evaluation and Review Technique, and CPM, the Critical Path Method. Both have been widely used in business and government in this country. It would also be legitimate to include the Logical Framework or "Logframe" model developed at U.S. Agency for International Development and general systems theory and operations research approaches in this category. Two management-oriented systems models were originated by evaluators: the UTOS model where U stands for Units, T for Treatments, O for Observing Observations and S for Settings; and the CIPP model where the C stands for Context, the I for Input, the first P for Process and the second P for Product. These management-oriented systems models emphasize comprehensiveness in evaluation, placing evaluation within a larger framework of organizational activities.
The third class of strategies are the qualitative/anthropological models. They emphasize the importance of observation, the need to retain the phenomenological quality of the evaluation context, and the value of subjective human interpretation in the evaluation process. Included in this category are the approaches known in evaluation as naturalistic or 'Fourth Generation' evaluation; the various qualitative schools; critical theory and art criticism approaches; and, the 'grounded theory' approach of Glaser and Strauss among others.
Finally, a fourth class of strategies is termed participant-oriented models. As the term suggests, they emphasize the central importance of the evaluation participants, especially clients and users of the program or technology. Client-centered and stakeholder approaches are examples of participant-oriented models, as are consumer-oriented evaluation systems.
With all of these strategies to choose from, how to decide? Debates that rage within the evaluation profession -- and they do rage -- are generally battles between these different strategists, with each claiming the superiority of their position. In reality, most good evaluators are familiar with all four categories and borrow from each as the need arises. There is no inherent incompatibility between these broad strategies -- each of them brings something valuable to the evaluation table. In fact, in recent years attention has increasingly turned to how one might integrate results from evaluations that use different strategies, carried out from different perspectives, and using different methods. Clearly, there are no simple answers here. The problems are complex and the methodologies needed will and should be varied.
Types of Evaluation
There are many different types of evaluations depending on the object being evaluated and the purpose of the evaluation. Perhaps the most important basic distinction in evaluation types is that between formative and summative evaluation. Formative evaluations strengthen or improve the object being evaluated -- they help form it by examining the delivery of the program or technology, the quality of its implementation, and the assessment of the organizational context, personnel, procedures, inputs, and so on. Summative evaluations, in contrast, examine the effects or outcomes of some object -- they summarize it by describing what happens subsequent to delivery of the program or technology; assessing whether the object can be said to have caused the outcome; determining the overall impact of the causal factor beyond only the immediate target outcomes; and, estimating the relative costs associated with the object.
Formative evaluation includes several evaluation types:
needs assessment determines who needs the program, how great the need is, and what might work to meet the need
evaluability assessment determines whether an evaluation is feasible and how stakeholders can help shape its usefulness
structured conceptualization helps stakeholders define the program or technology, the target population, and the possible outcomes
implementation evaluation monitors the fidelity of the program or technology delivery
process evaluation investigates the process of delivering the program or technology, including alternative delivery procedures
Summative evaluation can also be subdivided:
outcome evaluations investigate whether the program or technology caused demonstrable effects on specifically defined target outcomes
impact evaluation is broader and assesses the overall or net effects -- intended or unintended -- of the program or technology as a whole
cost-effectiveness and cost-benefit analysis address questions of efficiency by standardizing outcomes in terms of their dollar costs and values
secondary analysis reexamines existing data to address new questions or use methods not previously employed
meta-analysis integrates the outcome estimates from multiple studies to arrive at an overall or summary judgement on an evaluation question
Evaluation Questions and Methods
Evaluators ask many different kinds of questions and use a variety of methods to address them. These are considered within the framework of formative and summative evaluation as presented above.
In formative research the major questions and methodologies are:
What is the definition and scope of the problem or issue, or what's the question?
Formulating and conceptualizing methods might be used, including brainstorming, focus groups, nominal group techniques, Delphi methods, brainwriting, stakeholder analysis, synectics, lateral thinking, input-output analysis, and concept mapping.
Where is the problem and how big or serious is it?
The most common method used here is "needs assessment" which can include: analysis of existing data sources, and the use of sample surveys, interviews of constituent populations, qualitative research, expert testimony, and focus groups.
How should the program or technology be delivered to address the problem?
Some of the methods already listed apply here, as do detailing methodologies like simulation techniques, or multivariate methods like multiattribute utility theory or exploratory causal modeling; decision-making methods; and project planning and implementation methods like flow charting, PERT/CPM, and project scheduling.
How well is the program or technology delivered?
Qualitative and quantitative monitoring techniques, the use of management information systems, and implementation assessment would be appropriate methodologies here.
The questions and methods addressed under summative evaluation include:
What type of evaluation is feasible?
Evaluability assessment can be used here, as well as standard approaches for selecting an appropriate evaluation design.
What was the effectiveness of the program or technology?
One would choose from observational and correlational methods for demonstrating whether desired effects occurred, and quasi-experimental and experimental designs for determining whether observed effects can reasonably be attributed to the intervention and not to other sources.
What is the net impact of the program?
Econometric methods for assessing cost effectiveness and cost/benefits would apply here, along with qualitative methods that enable us to summarize the full range of intended and unintended impacts.
Clearly, this introduction is not meant to be exhaustive. Each of these methods, and the many not mentioned, is supported by an extensive methodological research literature. This is a formidable set of tools. But the need to improve, update and adapt these methods to changing circumstances means that methodological research and development needs to have a major place in evaluation work.
http://www.socialresearchmethods.net/kb/intreval.htm
How to Use Web Search Engines
What follows is a basic explanation of how search engines work. For more detailed and technical information about current methods used by search engines like Google, check out our discussion of Search Engine Ranking Algorithms.
Keyword Searching
Refining Your Search
Relevancy Ranking
Meta Tags
Concept-based Searching (This information is dated, but might have historical interest for researchers)
Search engines use automated software programs known as spiders or bots to survey the Web and build their databases. Web documents are retrieved by these programs and analyzed. Data collected from each web page are then added to the search engine index. When you enter a query at a search engine site, your input is checked against the search engine's index of all the web pages it has analyzed. The best URLs are then returned to you as hits, ranked in order with the best results at the top.
Keyword Searching
This is the most common form of text search on the Web. Most search engines do their text query and retrieval using keywords.
What is a keyword, exactly? It can simply be any word on a webpage. For example, I used the word "simply" in the previous sentence, making it one of the keywords for this particular webpage in some search engine's index. However, since the word "simply" has nothing to do with the subject of this webpage (i.e., how search engines work), it is not a very useful keyword. Useful keywords and key phrases for this page would be "search," "search engines," "search engine methods," "how search engines work," "ranking," "relevancy," "search engine tutorials," etc. Those keywords would actually tell a user something about the subject and content of this page.
Unless the author of the Web document specifies the keywords for her document (this is possible by using meta tags), it's up to the search engine to determine them. Essentially, this means that search engines pull out and index words that appear to be significant. Since search engines are software programs, not rational human beings, they work according to rules established by their creators for what words are usually important in a broad range of documents. The title of a page, for example, usually gives useful information about the subject of the page (if it doesn't, it should!). Words that are mentioned towards the beginning of a document (think of the "topic sentence" in a high school essay, where you lay out the subject you intend to discuss) are given more weight by most search engines. The same goes for words that are repeated several times throughout the document.
Some search engines index every word on every page. Others index only part of the document.
Full-text indexing systems generally pick up every word in the text except commonly occurring stop words such as "a," "an," "the," "is," "and," "or," and "www." Some of the search engines discriminate upper case from lower case; others store all words without reference to capitalization.
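The indexing idea above can be sketched in a few lines of Python. This is a toy inverted index, not any real engine's implementation; the stop-word list and sample documents are illustrative only.

```python
# Toy full-text indexing with stop-word removal.
STOP_WORDS = {"a", "an", "the", "is", "and", "or", "www"}

def build_index(docs):
    """Map each significant word to the set of document ids containing it."""
    index = {}
    for doc_id, text in docs.items():
        for word in text.lower().split():
            if word not in STOP_WORDS:
                index.setdefault(word, set()).add(doc_id)
    return index

docs = {
    1: "the heart is a pump",
    2: "an exam on cardiac health",
}
index = build_index(docs)
print(index["heart"])   # {1}
print("the" in index)   # False -- stop words are never indexed
```

A query is then answered by looking words up in the index rather than scanning every document, which is what makes stop-word removal worthwhile: common words would otherwise map to nearly every page.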
The Problem with Keyword Searching
Keyword searches have a tough time distinguishing between words that are spelled the same way but mean something different (e.g., hard cider, a hard stone, a hard exam, and the hard drive on your computer). This often results in hits that are completely irrelevant to your query. Some search engines also have trouble with so-called stemming -- i.e., if you enter the word "big," should they return a hit on the word "bigger"? What about singular and plural words? What about verb tenses that differ from the word you entered by only an "s" or an "ed"?
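The stemming question can be illustrated with a deliberately naive suffix-stripper. Real stemmers, such as the Porter algorithm, apply far more rules and exceptions; this sketch only shows the basic idea of reducing word forms to a common root.

```python
def naive_stem(word):
    """Strip a few common English suffixes -- a toy illustration of stemming."""
    for suffix in ("ies", "ed", "ing", "s"):
        # Only strip when enough of the word remains to be a plausible root.
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

print(naive_stem("walked"))   # walk
print(naive_stem("engines"))  # engine
print(naive_stem("big"))      # big (unchanged)
```

An engine that stems both the query and the index can match "walk" against "walked" and "walking"; one that does not will miss those variants entirely.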
Search engines also cannot return hits on keywords that mean the same, but are not actually entered in your query. A query on heart disease would not return a document that used the word "cardiac" instead of "heart."
Refining Your Search
Most sites offer two different types of searches--"basic" and "refined" or "advanced." In a "basic" search, you just enter a keyword without sifting through any pull-down menus of additional options. Depending on the engine, though, "basic" searches can be quite complex.
Advanced search refining options differ from one search engine to another, but some of the possibilities include the ability to search on more than one word, to give more weight to one search term than you give to another, and to exclude words that might be likely to muddy the results. You might also be able to search on proper names, on phrases, and on words that are found within a certain proximity to other search terms.
Some search engines also allow you to specify what form you'd like your results to appear in, and whether you wish to restrict your search to certain fields on the internet (i.e., usenet or the Web) or to specific parts of Web documents (i.e., the title or URL).
Many, but not all search engines allow you to use so-called Boolean operators to refine your search. These are the logical terms AND, OR, NOT, and the so-called proximal locators, NEAR and FOLLOWED BY.
Boolean AND means that all the terms you specify must appear in the documents, i.e., "heart" AND "attack." You might use this if you wanted to exclude common hits that would be irrelevant to your query.
Boolean OR means that at least one of the terms you specify must appear in the documents, i.e., bronchitis, acute OR chronic. You might use this if you didn't want to rule out too much.
Boolean NOT means that at least one of the terms you specify must not appear in the documents. You might use this if you anticipated results that would be totally off-base, i.e., nirvana AND Buddhism, NOT Cobain.
Not quite Boolean: + and -. Some search engines use the characters + and - instead of Boolean operators to include and exclude terms.
NEAR means that the terms you enter should be within a certain number of words of each other. FOLLOWED BY means that one term must directly follow the other. ADJ, for adjacent, serves the same function. A search engine that will allow you to search on phrases uses, essentially, the same method (i.e., determining adjacency of keywords).
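The three core Boolean operators can be sketched over a tiny in-memory document set. Real engines evaluate these operators against an index rather than scanning documents; the documents below are invented for illustration.

```python
# Toy Boolean retrieval: AND, OR, and NOT over a small document set.
docs = {
    1: "heart attack symptoms and treatment",
    2: "the heart of the city",
    3: "attack of the giant spiders",
}

def words(text):
    return set(text.lower().split())

def boolean_and(a, b):
    """Documents containing both terms."""
    return {d for d, t in docs.items() if a in words(t) and b in words(t)}

def boolean_or(a, b):
    """Documents containing at least one of the terms."""
    return {d for d, t in docs.items() if a in words(t) or b in words(t)}

def boolean_not(a, b):
    """Documents containing the first term but not the second."""
    return {d for d, t in docs.items() if a in words(t) and b not in words(t)}

print(boolean_and("heart", "attack"))  # {1}
print(boolean_or("heart", "attack"))   # {1, 2, 3}
print(boolean_not("heart", "attack"))  # {2}
```

Note how AND narrows the result set and OR widens it, which is exactly the trade-off described above.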
Phrases: The ability to query on phrases is very important in a search engine. Those that allow it usually require that you enclose the phrase in quotation marks, i.e., "space the final frontier."
Capitalization: This is essential for searching on proper names of people, companies or products. Unfortunately, many words in English are used both as proper and common nouns--Bill, bill, Gates, gates, Oracle, oracle, Lotus, lotus, Digital, digital--the list is endless.
All the search engines have different methods of refining queries. The best way to learn them is to read the help files on the search engine sites and practice!
Relevancy Rankings
Most of the search engines return results with confidence or relevancy rankings. In other words, they list the hits according to how closely they think the results match the query. However, these lists often leave users shaking their heads in confusion, since, to the user, the results may seem completely irrelevant.
Why does this happen? Basically it's because search engine technology has not yet reached the point where humans and computers understand each other well enough to communicate clearly.
Most search engines use search term frequency as a primary way of determining whether a document is relevant. If you're researching diabetes and the word "diabetes" appears multiple times in a Web document, it's reasonable to assume that the document will contain useful information. Therefore, a document that repeats the word "diabetes" over and over is likely to turn up near the top of your list.
If your keyword is a common one, or if it has multiple other meanings, you could end up with a lot of irrelevant hits. And if your keyword is a subject about which you desire information, you don't need to see it repeated over and over--it's the information about that word that you're interested in, not the word itself.
Some search engines consider both the frequency and the positioning of keywords to determine relevancy, reasoning that if the keywords appear early in the document, or in the headers, this increases the likelihood that the document is on target. For example, one method is to rank hits according to how many times your keywords appear and in which fields they appear (i.e., in headers, titles or plain text). Another method is to determine which documents are most frequently linked to other documents on the Web. The reasoning here is that if other folks consider certain pages important, you should, too.
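The frequency-and-position heuristics just described can be sketched as a scoring function. The weights here (a 3x bonus for title hits, a flat bonus for an early occurrence in the body) are illustrative assumptions, not any engine's actual formula.

```python
# Toy relevancy scoring: frequency plus field and position weighting.
def score(query, title, body):
    """Score a document for one query word, weighting title hits higher."""
    q = query.lower()
    title_hits = title.lower().split().count(q)
    body_words = body.lower().split()
    body_hits = body_words.count(q)
    # Early occurrences count extra, per the "topic sentence" heuristic.
    early_bonus = 1 if q in body_words[:10] else 0
    return 3 * title_hits + body_hits + early_bonus

doc_a = ("Diabetes basics", "diabetes is a chronic disease; managing diabetes well matters")
doc_b = ("Healthy eating", "a short note that mentions diabetes once at the end")

ranked = sorted([doc_a, doc_b], key=lambda d: score("diabetes", *d), reverse=True)
print(ranked[0][0])  # Diabetes basics
```

A document whose title and opening words match the query outranks one that mentions the keyword only in passing, which mirrors the reasoning described above.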
If you use the advanced query form on AltaVista, you can assign relevance weights to your query terms before conducting a search. Although this takes some practice, it essentially allows you to have a stronger say in what results you will get back.
As far as the user is concerned, relevancy ranking is critical, and becomes more so as the sheer volume of information on the Web grows. Most of us don't have the time to sift through scores of hits to determine which hyperlinks we should actually explore. The more clearly relevant the results are, the more we're likely to value the search engine.
Information On Meta Tags
Some search engines are now indexing Web documents by the Meta tags in the documents' HTML (at the beginning of the document in the so-called "head" tag). What this means is that the Web page author can have some influence over which keywords are used to index the document, and even over the description of the document that appears when it comes up as a search engine hit.
This is obviously very important if you are trying to draw people to your website based on how your site ranks in search engine hit lists.
There is no perfect way to ensure that you'll receive a high ranking. Even if you do get a great ranking, there's no assurance that you'll keep it for long. For example, at one period a page from the Spider's Apprentice was the number-one-ranked result on AltaVista for the phrase "how search engines work." A few months later, however, it had dropped lower in the listings.
There is a lot of conflicting information out there on meta-tagging. If you're confused, it may be because different search engines look at Meta tags in different ways. Some rely heavily on Meta tags; others don't use them at all. The general opinion seems to be that Meta tags are less useful than they were a few years ago, largely because of the high rate of spamdexing (web authors using false and misleading keywords in the Meta tags).
Note: Google, currently the most popular search engine, does not index the keyword meta tags. Be aware of this if you are optimizing your webpage for the Google engine.
It seems to be generally agreed that the "title" and the "description" Meta tags are important to write effectively, since several major search engines use them in their indices. Use relevant keywords in your title, and vary the titles on the different pages that make up your website, in order to target as many keywords as possible. As for the "description" Meta tag, some search engines will use it as their short summary of your URL, so make sure your description is one that will entice surfers to your site.
Note: The "description" Meta tag is generally held to be the most valuable, and the most likely to be indexed, so pay special attention to this one.
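As a sketch of how an indexer might read these tags, here is a small reader built on Python's standard-library HTML parser. The sample page and tag values are invented for illustration.

```python
# Extract "name"/"content" Meta tags from a page head, as an indexer might.
from html.parser import HTMLParser

class MetaReader(HTMLParser):
    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            d = dict(attrs)
            if "name" in d and "content" in d:
                self.meta[d["name"].lower()] = d["content"]

page = """<html><head>
<title>How Search Engines Work</title>
<meta name="description" content="A tutorial on search engine indexing and ranking.">
<meta name="keywords" content="search engines, ranking, relevancy, indexing">
</head><body>...</body></html>"""

reader = MetaReader()
reader.feed(page)
print(reader.meta["description"])
```

An engine that honors the "description" tag would display that string as the summary under your hit; one that ignores Meta tags would generate its own summary from the page body.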
In the keyword tag, list a few synonyms for keywords, or foreign translations of keywords (if you anticipate traffic from foreign surfers). Make sure the keywords refer to, or are directly related to, the subject or material on the page. Do NOT use false or misleading keywords in an attempt to gain a higher ranking for your pages.
The "keyword" Meta tag has been abused by some webmasters. For example, a recent ploy has been to put words such as "sex" or "mp3" into keyword Meta tags, in hopes of luring searchers to one's website by using popular keywords.
The search engines are aware of such deceptive tactics, and have devised various methods to circumvent them, so be careful. Use keywords that are appropriate to your subject, and make sure they appear in the top paragraphs of actual text on your webpage. Many search engine algorithms score the words that appear towards the top of your document more highly than the words that appear towards the bottom. Words that appear in HTML header tags (H1, H2, H3, etc.) are also given more weight by some search engines. It sometimes helps to give your page a file name that makes use of one of your prime keywords, and to include keywords in the "alt" image tags.
One thing you should not do is use some other company's trademarks in your meta tags. Some website owners have been sued for trademark violations because they've used other company names in the Meta tags. I have, in fact, testified as an expert witness in such cases. You do not want the expense of being sued!
Remember that all the major search engines have slightly different policies. If you're designing a website and meta-tagging your documents, we recommend that you take the time to check out what the major search engines say in their help files about how they each use meta tags. You might want to optimize your meta tags for the search engines you believe are sending the most traffic to your site.
Concept-based searching (The following information is out-dated, but might have historical interest for researchers)
Excite used to be the best-known general-purpose search engine site on the Web that relied on concept-based searching. It is now effectively extinct.
Unlike keyword search systems, concept-based search systems try to determine what you mean, not just what you say. In the best circumstances, a concept-based search returns hits on documents that are "about" the subject/theme you're exploring, even if the words in the document don't precisely match the words you enter into the query.
How did this method work? There are various methods of building clustering systems, some of which are highly complex, relying on sophisticated linguistic and artificial intelligence theory that we won't even attempt to go into here. Excite used a numerical approach. Excite's software determined meaning by calculating the frequency with which certain important words appeared. When several words or phrases that were tagged to signal a particular concept appeared close to each other in a text, the search engine concluded, by statistical analysis, that the piece was "about" a certain subject. For example, the word heart, when used in the medical/health context, would be likely to appear with such words as coronary, artery, lung, stroke, cholesterol, pump, blood, attack, and arteriosclerosis. If the word heart appears in a document with other words such as flowers, candy, love, passion, and valentine, a very different context is established, and a concept-oriented search engine returns hits on the subject of romance.
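The co-occurrence idea behind this statistical approach can be caricatured in a few lines: score a document's affinity to each concept by counting how many of that concept's signal words it contains. The concept names and signal-word lists below are illustrative, not Excite's actual data.

```python
# Toy concept-based scoring via signal-word co-occurrence.
CONCEPTS = {
    "cardiology": {"heart", "coronary", "artery", "stroke", "cholesterol", "blood"},
    "romance": {"heart", "flowers", "candy", "love", "passion", "valentine"},
}

def concept_scores(text):
    """Count, per concept, how many of its signal words appear in the text."""
    words = set(text.lower().split())
    return {name: len(words & signals) for name, signals in CONCEPTS.items()}

doc = "roses flowers and candy for your valentine love"
scores = concept_scores(doc)
print(max(scores, key=scores.get))  # romance
```

Note that an ambiguous word like "heart" belongs to both signal sets; it is the surrounding words that tip the score toward one concept or the other, which is exactly the disambiguation described above.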
Information Literacy
by
Kristen Nelson, Pearson Skylight Professional
For many teachers, their memories of completing a research report focus on being assigned a topic, being sent to the library, feeling lucky if they could find more than one resource (with the encyclopedia serving as the main source), and writing up the report using their own words. Most frequently, their biggest problem was not being able to find enough information in outdated books and encyclopedias.
Fast forward to the twenty-first century. The amount of information available over the Internet, on the news, and in newspapers, magazines, and books is astonishing and overwhelming. Students now are literally surrounded with Web pages of information, CD-ROMs with interactive programs, books, magazines, and other multimedia products. Most frequently, the biggest problem students face is finding too much information and not knowing what to do with it. Before students can be taught to understand concepts and skills and be asked to use their multiple intelligences, they need specific tools to work with the large amount of information at their fingertips. Without these skills, students feel like innocent lambs being thrown to the information wolves.
Disturbing trends about finding information and doing research are developing in students. The top three are (1) students believing that anything from a computer is better than anything that comes from a book, (2) students viewing the library as a last resort, and (3) students being more concerned with the quantity than the quality of their sources. It is critical that students learn to find information from many sources and be able to analyze its quality relatively quickly. Only then are they able to move to the next step of using the information to produce a piece of work. These searching and analyzing skills are information literacy skills, and the sooner teachers begin helping students learn them, the better the students' chances are of succeeding in the Information Age.
In a time when many are crying for back to basics in schools throughout the United States, teachers need to carefully evaluate what the basics are for students living in the twenty-first century. Reading, writing, and arithmetic are still at the top of the list, but basic skills also include being able to find, analyze, and work with information. Teachers can no longer expect to fill their students' heads with content and assume the students are prepared for the future. Information literacy skills can now join reading, writing, and arithmetic as basic skills of the twenty-first century.
Michael Eisenberg and Doug Johnson (1996) propose six components of information literacy skills in their Big Six Skills approach. Figure 5.1 briefly lists the information literacy skills, with the related Bloom's taxonomy skill in parentheses. Teachers may want to post this list in their classrooms, because these skills need to be seen and discussed on a regular basis.