7 Quartz recognition experiment

[Please note that part of this chapter has been published in the journal Archaeometry]

"Typological systems are essential for communication...as well as for interpretive purposes. For both communication and interpretation, it is important to know that different individuals using the same typology classify artifacts in similar ways, but the consistency with which typologies are used is rarely evaluated or explicitly tested...Typologies are "sacred knowledge" acquired as a rite of passage into professional status, and the ability to classify things correctly is a basic professional skill about which some archaeologists are very sensitive" (Whittaker et al. 1998, 129, 132, emphasis added).

7.1 Introduction

This chapter discusses the two quartz recognition experiments conducted with volunteer participants at the Belderrig Summer 2008 excavations and at the WAC conference, Dublin 2008. These two related experiments, dubbed ‘quartz quizzes’, were undertaken in order to understand how quartz artefacts are identified and classified by different people. In Section 7.2 I introduce the reasons for the quizzes, discuss the considerations and decisions involved in formulating and conducting them, and introduce some previous research on researchers’ consistencies and biases in analysing and classifying artefacts, including interobserver biases. Section 7.3 details the quiz presented to the conference attendees at WAC, Dublin 2008. Section 7.4 concludes the chapter with an overview and discussion. For reasons outlined in Section 7.2, the results of the quiz at Belderrig are dealt with summarily, and are provided in Appendix A- 48.

7.2 Formulating and introducing the quizzes

The initial analysis of the experimental assemblage made it clear that the correct identification of a given artefact’s place in the chaîne opératoire was often problematic, especially with non-proximal flake fragments and core fragments, thus leading to incorrect interpretations. As the experimental assemblage’s debitage had been for the most part divided into strike piles, the fragmentation of a given flake could be reconstructed through conjoining; once this possibility of conjoining was taken away and the analysis was done ‘blind’, as would be the case with an archaeological assemblage, the results were quite different. Following discussions of this, Graeme Warren suggested that members of the 2008 Belderrig excavation team could be asked to participate in a ‘quartz quiz’, devised in order to understand how people with a variety of lithic analysis skills identified and classified quartz artefacts. After this quiz (Quiz A) was held in Belderrig in June 2008, I was invited to run the quiz again (Quiz B) at the WAC conference, which was taking place in UCD the following month. It was hoped that by holding the quiz at the week-long conference a cross section of both Irish and international archaeologists, and especially lithic analysts (including, hopefully, some with experience in analysing quartz), would volunteer to participate.

As detailed in Chapter 5, when introducing the quiz to the participants the pieces presented to them were described as 'Belderrig quartz', implying that they were from the Belderrig excavations – they were in fact made from Belderrig quartz, but were not prehistoric artefacts. This was done for two reasons:
1. In order to maintain the possibility that some pieces were not anthropogenic in origin – as one might anticipate in an archaeological assemblage but not in an experimental one.
2. I was concerned that if the quiz had taken place with participants knowing that a definitive answer existed, interpretations may have been affected by this knowledge.
Both factors seemed very important in terms of testing an archaeological typology (where prior knowledge does not usually exist). It was, of course, essential that all participants took the quiz under the same conditions – hence not being able to respond correctly to the direct questions. I would like to apologise to all participants for misleading them about the nature of the materials they were examining and I hope that they understand the logic behind the decision. I want to stress that no attempt whatsoever was made to ‘antique’ the quartz pieces; the pieces shown to the participants were in the same condition as when they were knapped. While a few participants commented that a few of the pieces appeared ‘fresh’, for the most part there were no objections voiced as to the antiquity of the pieces.

A selection of 30 pieces of quartz from the experimental assemblage was initially chosen for Quiz A; 20 of these pieces were subsequently used for Quiz B (Table 7-1 and Appendix A- 42). The reason for presenting only 20 of the original 30 pieces for Quiz B was that feedback from the first quiz suggested that a shorter quiz would attract more participants, which would outweigh the resulting loss of some information. The pieces were chosen because they were representative flakes and cores knapped by both bipolar and direct percussion, with 30% being bipolar cores and flakes (Appendix B- 10, Appendix B- 11, Appendix B- 12, Appendix B- 13). The pieces were graded into what I (subjectively) considered would be levels of difficulty – difficulty one being easy, difficulty two moderately difficult, and difficulty three being difficult (Table 7-1). This grading was done so that I could compare my perception of how difficult a given artefact was to recognise with the participants’ actual responses. For the Quiz B selection, a greater proportion of ‘difficulty 2’ pieces were excluded, mainly a variety of platform flake fragments; these were excluded so that a balanced selection of other types and difficulty levels remained. The 30 pieces were the results of ten experimental knapping events, an ‘event’ signifying the knapping of one parent core (Figure 7-1). The four platform core fragments selected were all conjoins of one core, while there were four other sets of conjoined platform flakes; for the 20 pieces examined by both quizzes, two of the conjoined flake sets were not examined, and one of the core fragments that formed the conjoin was excluded, leaving a set of three conjoined core fragments.

Piece              Quiz A                               Quiz B
                   Diff. 1  Diff. 2  Diff. 3  Total     Diff. 1  Diff. 2  Diff. 3  Total
Bipolar
  Core, complete   4        -        -        4         4        -        -        4
  Flake, complete  -        2        2        4         -        1        1        2
Platform
  Core, complete   1        -        -        1         1        -        -        1
  Core, fragment   -        -        4        4         -        -        3        3
  Flake, complete  3        -        -        3         2        -        -        2
  Flake, fragment  6        6        2        14        4        2        2        8
Total              14       8        8        30        11       3        6        20

Table 7-1 Selected quiz pieces – Diff. = Difficulty level

Due to the different terminologies that various people use in describing debitage (discussed previously, see Section 5.4), I asked all the participants to use the terminology from Inizan et al. (1999) which was available for them to examine if the terminology was unfamiliar, and also presented each participant with a sheet (Appendix A- 43)[15] containing a diagram showing the structure of the basic terminology for classifying the pieces and a list of attributes, such as fragmentation, flake termination, striking marks etc. to note where possible. I asked the participants to give as much or as little information as they so wished. Figure 7-2 shows the classificatory structure used for the quiz. The first level of identification is ‘piece’, which consisted of either cores or debitage; or they could be described as natural, or as indeterminate if the interpretation was unclear. The second level is ‘type’ which for cores were multiplatform and bipolar and so forth, and for debitage were flakes and blades, or debris; the third level is ‘sub-type’ which for debitage was regular platform flakes, irregular platform flakes, and bipolar flakes – a bipolar flake could, of course, also be regular or irregular, but this was not included on the sheet provided.[16]

In terms of the debitage, I was not concerned whether the participants classified a piece as either a flake or blade, or regular or irregular. Instead, I was interested in classifications of debitage as either flake/blade or debris, and identifications of the sub-types of either platform or bipolar technology. As well as those shown, other types are possible such as cores that show signs of a combination of platform and bipolar reduction.

While discussing the quiz and the terminology of Inizan et al. (1999) with the participants of Quiz A at Belderrig, I explained that a given piece could be a core, debitage, indeterminate, or natural. The latter two classificatory choices were not written down on the sheet provided, but given verbally. For Quiz B at WAC, however, the categories ‘natural’ and ‘indeterminate’ were written on the sheet alongside ‘core’ and ‘debitage’. This written addition dramatically altered the responses to the pieces. It appears that the participants of Quiz A implicitly assumed the pieces presented to them to be artefactual, whereas the participants of Quiz B did not have this assumption – Table 7-2 highlights the significant difference in responses for the 20 pieces that both quizzes looked at: while only 1% of the responses in Quiz A were either ‘natural’ or ‘indeterminate’, in Quiz B this jumped to 21%; looking only at participants from Quiz B that had ‘substantial’ lithic experience did not alter the percentage, therefore this change is not related to experience with lithics.

Figure 7-1 Relationships of pieces examined during quizzes

Figure 7-2 Classificatory structure for quiz artefacts

It is probable that this difference is due specifically to the presentation format of the quiz, or else to the fact that Quiz A was undertaken at Belderrig in the vicinity of the excavation, which allowed an assumption that the pieces were all excavated there. Because of this significant difference between the two quizzes at the first level of classification, their data were not combined and analysed together as had been initially planned. Therefore, the results of the two quizzes will be dealt with separately, with the main focus of analysis on Quiz B. A summary of the results from the Belderrig quiz is provided in Appendix A- 48.

Response Quiz A (%) Quiz B (%)
Indeterminate 1.03 12.66
Natural - 8.94
Total 1.03 21.16

Table 7-2 Responses to 20 pieces seen by both quizzes

An interesting result of this lack of the explicit categories of natural and indeterminate for Quiz A was that its participants nevertheless did not categorise the experimental assemblage more accurately even though they did not use those categories. Instead, they used the category ‘debris’ more often than the participants of Quiz B, and the frequencies of use for ‘debris’ matched the ‘natural’ and ‘indeterminate’ pieces of Quiz B. Therefore, one can surmise that while the participants of Quiz A were explicitly asked to use the term ‘debris’ as used by Inizan et al. (1999), some of them may in fact have ‘broadened’ its meaning to suit their needs in attempting to classify pieces they were unsure about.

7.3 Quiz B - WAC, Dublin 2008

7.3.1 Participants' background and skill levels

A major aim of the quiz was to determine how differing skill levels affected the identification and classification of a quartz assemblage. The participants were asked to rate and give details concerning their archaeological experience, lithic experience, and quartz experience – here, ‘lithic’ stood for stone in general, while ‘quartz’ meant specifically quartz. Four categories of experience were provided – none, student, minor, and substantial. The category of ‘student’ was used in order to distinguish those participants who would have had some experience with fieldwork, lithics, or quartz as a student such as in introductory lithic analysis classes and so forth, but less than ‘minor’ experience. The participants were asked to also provide information on their countries of archaeological education and work, their level of education, and their current position.

47 people attending WAC volunteered to participate in the quartz quiz. These represented archaeologists who received their archaeological education in 17 countries, with 24 countries listed as their main countries of work. All continents bar Asia (and Antarctica) were listed as places of archaeological education; 43% were primarily educated in Ireland and Britain; 19% in Australasia, and 15% in America. The participants ranged from senior archaeologists with a wealth of archaeological and lithic analysis experience to students with little. Table 7-3 shows the breakdown of the participants in terms of their current occupation and experience with lithics and quartz; 62% of the participants described themselves as having substantial experience with lithics, while 23% had substantial quartz experience. Of those with no quartz experience, 36% had substantial lithic experience and 27% had minor lithic experience.

Current position     Total  Lithic experience                   Quartz experience
                            None  Student  Minor  Substantial   None  Student  Minor  Substantial
Field archaeologist  21     -     2        7      12            3     2        9      7
Lecturer             8      -     -        3      5             1     1        6      -
Researcher           5      -     -        -      5             -     1        1      3
PhD Student          5      -     -        -      5             4     1        -      -
Undergraduate        4      1     3        -      -             2     2        -      -
Museum curator       1      -     -        1      -             -     -        1      -
Petrologist          1      1     -        -      -             1     -        -      -
Retired              1      -     -        -      1             -     -        1      -
Unemployed           1      -     -        -      1             -     -        -      1
Total                47     2     5        11     29            11    7        18     11

Table 7-3 Quiz B - Occupation and lithic and quartz experience.

For the 11 participants with substantial quartz experience, 36% work in Australia and 27% in Scandinavia, with the rest working in Britain or North America; most were educated where they work (two listed more than one work location, including Africa, India, and France). For the participants with substantial lithic experience (excluding the 11 with substantial quartz experience mentioned above), 51% work in Ireland or Britain, 17% in Australia, and 16% in America, with the rest working in Continental Europe (some listed more than one work location, including Israel, Africa, and South America). Again, almost all of these participants work where they were educated. Table 7-3 highlights that almost half of the participants described themselves as field archaeologists and 20% were current students. Only one participant, a petrologist, had no immediate connection with archaeology; he and one other participant (an undergraduate) had no experience with lithic analysis.

7.3.2 First level identification - Piece: core or debitage

Looking at the first, basic level of classification, of whether a given piece was either a core or debitage, the overall correct response rate was just 61%, with the correct responses being significantly higher for the actual debitage pieces (75%) compared to the actual core pieces (39%). Those with substantial quartz experience had greater success (Table 7-4) – it should be remembered that four out of the 11 participants with no quartz experience described themselves as having substantial lithic experience.

Quartz experience  Actual piece (%)
                   All (n=20)  Core (n=8)  Debitage (n=12)
None (11)          57          32          74
Student (7)        60          36          76
Minor (18)         60          40          73
Substantial (11)   66          47          80
All (47)           61          39          75

Table 7-4 Correct responses to Piece category
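
As a quick consistency check on Table 7-4, the overall rate can be recovered from the separate core and debitage rates. The sketch below assumes that every participant responded to every piece, so that the two rates can be pooled by the number of pieces alone; the function name `pooled_rate` is illustrative and not part of the original analysis.

```python
# Correct-response rates (%) by actual piece for all 47 participants (Table 7-4).
rates = {"core": 39, "debitage": 75}
# Number of quiz pieces of each kind (column headers of Table 7-4).
counts = {"core": 8, "debitage": 12}


def pooled_rate(rates, counts):
    """Weighted mean of per-group rates, weighting each group by its piece count."""
    total = sum(counts.values())
    return sum(rates[kind] * counts[kind] for kind in rates) / total


overall = pooled_rate(rates, counts)
print(round(overall, 1))  # consistent with the reported overall rate of 61%
```

Because debitage pieces outnumber cores, the overall figure sits closer to the debitage rate than a simple average of the two columns would suggest.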

It is clear that for all levels of experience the success rate was low, but a more interesting question is what the respondents described the pieces as. Looking at what these incorrect responses were, Appendix A- 44 shows that for the incorrectly identified cores, the misidentification was split between describing them as debitage or as natural or indeterminate; for the substantial quartz experience cohort however, they were more likely to describe them as debitage. For the misidentified debitage, the pieces were twice as likely to be misidentified as natural or indeterminate instead of as cores, apart from the student quartz experience cohort who misidentified them as cores.

7.3.3 Second level identification - Types

The identification of types is divided between core types and debitage types (Figure 7-2). The initial difficulty with analysing the results of the ‘type’ identification relates to the large percentage of non-responses for the core types in this category. This omission of the type could imply that the participant was unsure of the type and decided to leave it simply as a generic ‘core’, or that certain participants systematically did not answer the category, or else, on occasion, forgot to fill in that item; a combination of all three of these possibilities is more than likely.

Cores

Beginning with core types, the correct response rate for bipolar and multiplatform cores was very low. Table 7-5 gives the response rates for the cores collectively and by type, and also for responses when the piece was correctly identified as a core. Overall, 10% of the responses for core type were correct; looking at only those responses where the piece was correctly identified, i.e. when the participants knew they were looking at a core, this rises to 25% (Table 7-5). Among those who correctly identified a piece as a core, the success rate was 18% for bipolar cores and 32% for platform cores.

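Overall score by core type (%)
Quartz experience  All Core (n=8)  Bipolar, Comp. (n=4)  Platform, Comp. (n=1)  Platform, Frag. (n=3)
None (11)          9               5                     36                     6
Student (7)        7               -                     14                     14
Minor (18)         10              10                    22                     6
Substantial (11)   13              11                    46                     3
All                10              7                     30                     6

Score when Piece was correct, by core type (%)
Quartz experience  All Core (n=8)  Bipolar, Comp. (n=4)  Platform, Comp. (n=1)  Platform, Frag. (n=3)
None (11)          29              17                    50                     25
Student (7)        20              -                     25                     43
Minor (18)         24              22                    29                     25
Substantial (11)   27              22                    45                     14
All                25              18                    38                     26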

Table 7-5 Scores for core type; overall scores and scores when correctly identified as a core

Appendix A- 45 gives the responses for those who correctly identified a given artefact as a core. This table highlights that for all the cores, the category of type was left blank a significant amount of the time, even for the substantial quartz experience cohort. The piece that caused the least difficulty in identification was the complete platform core – as Table 7-1 shows, this ease in identification was predicted, as this piece had been given a difficulty level of 1; in fact, this piece was included in the quiz as an ‘easy’ piece, to show the participants a familiar shaped object, i.e. a clear-cut multiplatform core. The bipolar cores, however, were consistently misidentified, even though I had surmised that all of these would be easily identified and had assigned all of them a difficulty level of 1. This proved incorrect, and these four bipolar cores were identified as cores for only 41% of the responses; the substantial quartz experience cohort identified them as cores for just 52% of the responses. Looking at responses when they were correctly identified as cores, they were identified as bipolar for 18% of the overall responses and 22% for the substantial quartz experience cohort (Appendix A- 45).

For the three platform core fragments (see Appendix B- 8) I had surmised that, due to their morphologies, two of them would be described as cores, while one would be identified as debitage. Table 7-6 shows the piece category responses for the three core fragments, together with the type and sub-type categories of ‘debris’ and ‘flake/blade, bipolar’; these are highlighted due to the high response rate for them. This table highlights that for the first core fragment there was little consensus on what the artefact was, with few calling it a core; for the second fragment there was more general agreement that it was a core, with those with less experience with quartz faring better. For the third core fragment, all experience categories were more decisively incorrect in calling it debitage, and compared to the other two core fragments all were less likely to describe it as debris. Therefore, most considered this to be a flake, and a significant number regarded this core fragment as a bipolar flake/blade. What is interesting about this latter identification is that there were only a few correct responses for the two bipolar flakes presented during the quiz; while there were a total of 38 responses of ‘bipolar flake’ or ‘possible bipolar flake’, only five of these were for actual bipolar flakes, which gives a correct response rate of just 5%.
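
The 5% figure for bipolar flakes can be read in more than one way, and the short calculation below sets two readings side by side. It is a sketch under an assumption: that the rate is taken over all possible responses to the two bipolar flakes (47 participants, two pieces), which is the reading that reproduces 5%. The variable names, including the borrowed terms `recall` and `precision`, are illustrative and do not appear in the original analysis.

```python
participants = 47        # Quiz B participants (Section 7.3.1)
bipolar_flakes = 2       # actual bipolar flakes shown in Quiz B (Table 7-1)
bipolar_responses = 38   # responses of 'bipolar flake' or 'possible bipolar flake'
correct_bipolar = 5      # of those, responses given to actual bipolar flakes

# Reading 1 (assumed to be the one behind the 5% figure): correct responses
# out of every response that could have been given to the two bipolar flakes.
possible = participants * bipolar_flakes
recall = correct_bipolar / possible

# Reading 2: of all the times 'bipolar flake' was used, how often it was right.
precision = correct_bipolar / bipolar_responses

print(f"{recall:.1%} of possible responses, {precision:.1%} of 'bipolar' calls")
```

On either reading the label was rarely applied to the right pieces: most uses of ‘bipolar flake’ were for pieces that were not bipolar flakes.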

Exp #  Quartz experience  Core (%)  Debitage All (%)  (Debitage – Debris) (%)  (Debitage – Bipolar) (%)  Indeterminate (%)  Natural (%)
2021   None (11)          9         45                (45)                     -                         27                 18
       Student (7)        29        29                (29)                     -                         14                 29
       Minor (18)         17        17                (11)                     -                         33                 33
       Substantial (11)   9         27                (27)                     -                         45                 18
2022   None (11)          64        18                (18)                     -                         -                  18
       Student (7)        71        29                (29)                     -                         -                  -
       Minor (18)         44        28                (17)                     (6)                       11                 17
       Substantial (11)   55        27                (9)                      (18)                      18                 -
2023   None (11)          -         73                -                        (9)                       27                 -
       Student (7)        -         100               (14)                     (29)                      -                  -
       Minor (18)         6         89                (17)                     (17)                      6                  -
       Substantial (11)   -         82                (9)                      (27)                      18                 -

Table 7-6 Responses for three platform core fragments. Responses for all ‘debitage’ are provided and also have been subdivided, highlighting the responses of ‘debitage-debris’ and ‘debitage-bipolar’ in brackets. E.g. for the first row all of the 45% of ‘debitage’ responses were ‘debris’ while for the last row the 82% of ‘debitage’ consisted partially of 9% ‘debris’ and 27% ‘bipolar’

Debitage

Looking at the typing of the debitage, which concerned categorising pieces as flakes or debris, it is clear that debitage was easier to type than cores. Nevertheless, the overall success rate was still low at 65%, with the bipolar flakes the easiest for all levels of experience to type (Table 7-7) although, as noted above, not to sub-type as bipolar flakes (see also below, p. 160). Appendix A- 46 lists the responses, highlighting that when incorrect the complete flakes were often misidentified as cores or as indeterminate/natural; the flake fragments were more often described as indeterminate/natural and sometimes as debris; and the bipolar flakes were described as debris or as indeterminate/natural.

Quartz experience  Flake, Complete (n=2) (%)  Flake, Fragment (n=8) (%)  Flake, Bipolar, Complete (n=2) (%)
None (11)          50                         65                         68
Student (7)        43                         64                         86
Minor (18)         42                         65                         75
Substantial (11)   59                         72                         91
All                49                         66                         79

Table 7-7 Correct response rate for debitage type by sub-type

7.3.4 Third level identification - Debitage sub-type

The debitage sub-type asked the participants to divide the debitage between regular and irregular platform flakes and bipolar flakes (Figure 7-2). The non-response rate to the debitage sub-type category was high at 64% - this percentage accounts for only those who correctly identified a given artefact as a flake. Overall, the success rate for identifying the sub-type of the debitage was only 16%, with this rising to 25% when only correct responses for debitage are counted (Table 7-8). Appendix A- 47 lists the responses for debitage sub-type when the type category was correctly identified. The bipolar flakes were almost never classed correctly, with an overall success rate of only 4% - 4% of the platform flakes were also categorised as bipolar.

Quartz experience  Response (%)  Response when Type was correct (%)  Platform flake, complete  Platform flake, fragment  Bipolar flake, complete
None (11)          11            18                                  18                        23                        -
Student (7)        17            26                                  50                        31                        -
Minor (18)         18            28                                  40                        32                        7
Substantial (11)   17            24                                  23                        30                        5
All                16            25                                  31                        29                        4

Table 7-8 Correct response rate for debitage sub-type when type category was correct

7.3.5 Fragmentation identification

One of the chief difficulties with analysing quartz is due to its fragmentation characteristics. During this quiz, the participants were asked to define the pieces as fragments or complete. The response rate to this question was low (Table 7-9). One of the reasons for this is probably that the participants did not note when they thought a piece was complete, but only when a piece was a fragment – while there were 116 responses of either ‘fragment’ or ‘possible fragment’, there were only six responses of ‘complete’. This possibility is supported by the fact that in the case of debitage when the fragment was not noted, there was no corresponding note of break type either, suggesting that they did indeed deem the piece to be complete. Overall, 36% of the participants made no mention of fragmentation whatsoever; in terms of quartz experience this broke down into 64% for none, 57% for student, 11% for minor, and 36% for substantial. The latter category bucked the trend with a lower than expected response rate, given that the more experienced participants generally provided more, and more accurate, responses. Out of the four with substantial experience who gave no response to fragment, three also gave no responses at all to the attribute categories of break type, striking marks, termination and so forth, possibly implying that they treated the quiz at a more general level than their colleagues, or otherwise that they were more hesitant in categorising than their counterparts.

Quartz experience  All* (%)  Cores* (%)  Debitage* (%)  ‘Blade’ (%)  ‘Flake’ (%)
None (11)          9         5           12             25           13
Student (7)        10        5           12             11           19
Minor (18)         18        12          21             15           26
Substantial (11)   25        10          33             11           40
All                17        9           21             16           26
* excluding ‘natural’ and ‘indeterminate’

Table 7-9 Response rate for 'fragment' by variable

Overall, the response rate for fragment was 13%. However, this percentage includes responses such as ‘natural’ and ‘indeterminate’, for which the respondents would not be expected to detail fragmentation. When these are excluded the overall response rate for fragmentation was 17%, ranging from 9% for no quartz experience to 25% for substantial experience. For cores the response rate was 5% for no experience and 10% for substantial, and for debitage it was 12% for no experience and 33% for substantial.
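
The headline figures in this section can be cross-checked from the counts given in the text. The sketch below makes two assumptions: that the 13% response rate is taken over all 940 possible responses (47 participants, 20 pieces), and that the number of participants in each cohort who made no mention of fragmentation can be back-calculated by rounding the reported percentages. Only the count of four for the substantial cohort is stated directly in the text; the other implied counts are inferred.

```python
participants, pieces = 47, 20
fragment_notes = 116   # responses of 'fragment' or 'possible fragment'
complete_notes = 6     # responses of 'complete'

# Response rate for the fragment attribute over every possible response.
possible = participants * pieces
rate = (fragment_notes + complete_notes) / possible

# Cohort sizes by quartz experience and the reported share (%) of each cohort
# that made no mention of fragmentation at all.
cohort_sizes = {"None": 11, "Student": 7, "Minor": 18, "Substantial": 11}
no_mention_pct = {"None": 64, "Student": 57, "Minor": 11, "Substantial": 36}

# Inferred head counts: nearest whole number of participants per cohort.
implied = {k: round(no_mention_pct[k] / 100 * n) for k, n in cohort_sizes.items()}
overall_no_mention = sum(implied.values()) / sum(cohort_sizes.values())

print(f"fragment response rate: {rate:.0%}")
print(f"implied counts: {implied}, overall: {overall_no_mention:.0%}")
```

Under these assumptions the implied counts sum to 17 of the 47 participants, which agrees with the reported 36% who made no mention of fragmentation.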

Because of this low rate in noting the fragmentation of the pieces it is not straightforward to get an idea of how successful the participants were at classifying the completeness of the pieces. In order to remove extraneous data, one avenue is to only look at responses when the participants thought that they were looking at debitage; this excludes an analysis of the cores as overall the response rate for them was meagre. By doing this, this analysis is giving the participants the benefit of the doubt and allowing the most optimistic result for the category. When this is done, there were 111 responses in total (21% of the possible) with an overall accuracy rate of 13%. Excluding the non-responses to the fragment category, the accuracy in describing a flake fragment as fragment was 89%, suggesting that when a participant thought it was a flake fragment, they noted so; the greater inaccuracy was in misidentifying complete flakes as fragments – here only 33% of the responses were correct.

7.3.6 Modified pieces

The participants were provided with a list of suggested modified types, which was included in order to examine the observations of retouch. As discussed previously (Section 4.3), the identification of retouch on quartz can be problematic, and more often than not ‘tools’ and tool types are identified and defined by the presence of retouch. None of the pieces presented to the participants had been modified or retouched in any way. Table 7-10 shows the ‘modified’ categories that were presented on the form and the number of observations, including observations of ‘possible’ types. (The category ‘trimmed’ relates to Irish Later Mesolithic types such as butt-trimmed and distally trimmed.) In addition to the categories provided on the form, some participants noted instances of ‘modified’ and ‘possible use wear’ as well (Figure 7-3). Modification was noted in 9% of the 940 observations (Table 7-11). However, only 31 of the 47 participants noted at least one modification; for these 31 the observation rate was 14%. For the substantial lithic experience cohort it was 6% overall; the 52% of this cohort who noted at least one modification noted it 11% of the time. For those with substantial quartz experience it was 5% overall; the 46% of this cohort who noted at least one modification noted it 10% of the time.

‘Modified’ Observations
Arrowhead 4
Borer 9
Retouched 36
Scraper 35
Trimmed -

Table 7-10 Suggested ‘modified’ types and Quiz B observations
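
The observation rates quoted for all participants can be approximated from the counts in Table 7-10. This is a rough sketch only: it counts just the categories listed on the form (treating ‘Trimmed’ as zero) and so leaves out the additional ‘modified’ and ‘possible use wear’ notes, and the variable names are illustrative. Since none of the pieces had been modified, every observation counted here is a false positive.

```python
# Observations of each suggested 'modified' type in Quiz B (Table 7-10).
observations = {"Arrowhead": 4, "Borer": 9, "Retouched": 36, "Scraper": 35, "Trimmed": 0}

participants, pieces = 47, 20
noting_participants = 31   # participants who noted at least one modification

listed = sum(observations.values())

# Rate over every participant-piece pairing, and over the pairings of only
# those participants who noted at least one modification.
overall = listed / (participants * pieces)
among_noting = listed / (noting_participants * pieces)

print(f"{listed} observations: {overall:.1%} overall, {among_noting:.1%} among those noting any")
```

Both values round to the 9% and 14% reported for all participants, which suggests that the form categories account for most of the observations.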

Lithic experience and quartz experience  Overall observation rate (%)  % for those with at least one observation
All (47)                                 9                             14
Substantial lithic exp. (29)             6                             11
Substantial quartz exp. (11)             5                             10

Table 7-11 Observation rates for modified pieces

Figure 7-3 Modified types observed

Figure 7-4 [top] Observations of modification per piece and by lithic experience
Figure 7-5 [bottom] Observations of modification per piece and by quartz experience

Overall, the most common description of modification was either as a generic ‘scraper’ or general ‘retouch’ (Figure 7-3). While a number of arrowheads or possible arrowheads were noted, these were by the student or minor lithic experience cohorts. Figure 7-4 shows the identification of modified pieces by individual artefact and lithic experience, while Figure 7-5 gives the same by quartz experience. 14 of the 20 artefacts presented were identified as modified, with five of these accounting for 66% of the observations. The substantial lithic experience cohort identified 12 modified pieces and the substantial quartz experience cohort identified six modified pieces. The substantial quartz experience cohort did not just choose the five most commonly chosen artefacts; half of their observations were of artefacts which accounted for only 20% of the overall observations.

7.3.7 Evaluating skill levels

In order to compare the varying analytical skill levels, logistic regression was run on the combined skill levels of lithic and quartz experience, which gave 10 skill levels ranging from ‘None, None’ to ‘Substantial, Substantial’, following the format of lithic/quartz experience. Logistic regression is used in situations where one wants to predict outcomes from the values of a set of predictor variables and where the response is binary, i.e. yes/no, correct/incorrect (2006b). The ‘Substantial, Substantial’ skill level was used as the reference category against which the other skill levels were compared, and the analysis was run on the scores for Piece, Type, and Class.
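
The coding implied by this description can be sketched as follows. This is a minimal illustration, not the analysis as it was actually run: the names `LEVELS`, `COMBINED`, `REFERENCE` and `dummy_code` are hypothetical, and the sketch assumes indicator (dummy) coding with the reference category omitted, which is the usual way a categorical predictor with a reference category enters a logistic regression. The ten combined levels are those listed in Tables 7-12 to 7-14; they coincide with the pairs in which quartz experience does not exceed lithic experience.

```python
LEVELS = ["None", "Student", "Minor", "Substantial"]

# Combined (lithic, quartz) levels: quartz experience never exceeds lithic
# experience, which yields the ten levels listed in Tables 7-12 to 7-14.
COMBINED = [
    (LEVELS[li], LEVELS[qi])
    for li in range(len(LEVELS))
    for qi in range(li + 1)
]
REFERENCE = ("Substantial", "Substantial")


def dummy_code(level):
    """Return one 0/1 indicator per non-reference level (assumed coding)."""
    if level not in COMBINED:
        raise ValueError(f"unknown combined level: {level}")
    return [1 if level == other else 0 for other in COMBINED if other != REFERENCE]


print(len(COMBINED))                     # number of combined skill levels
print(dummy_code(("Minor", "None")))     # a single 1 marks the cohort
print(dummy_code(REFERENCE))             # all zeros: the baseline cohort
```

With ten levels and one held back as the baseline, nine indicators remain, which matches the nine degrees of freedom reported for ‘Experience combined’ in the tables.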

For the scores to the Piece category, only the Minor, None experience cohort (minor lithic experience and no quartz experience) was significantly different from the Substantial, Substantial experience cohort (Table 7-12) with the odds of the former cohort answering correctly being 51% of the odds of the Substantial, Substantial experience cohort answering correctly.

Wald χ² df p
Experience combined 7.642 9 0.571
None, None 0.224 1 0.636
Student, None 1.889 1 0.169
Student, Student 2.615 1 0.106
Minor, None 5.306 1 0.021
Minor, Student 0.028 1 0.867
Minor, Minor 1.364 1 0.243
Substantial, None 1.037 1 0.309
Substantial, Student 0.224 1 0.636
Substantial, Minor 1.991 1 0.158
Constant 22.677 1 0.000

Table 7-12 Scores for Piece category. Logistic regression. Variable: Experience combined. Reference category: Substantial, Substantial
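
The p-values for the individual cohorts in Table 7-12 follow from the Wald statistics. The sketch below assumes that each single-cohort Wald statistic is referred to a chi-square distribution with one degree of freedom, for which the upper-tail probability has a closed form in the complementary error function; `wald_p_value` is an illustrative helper and does not cover the nine-degree-of-freedom ‘Experience combined’ row.

```python
import math


def wald_p_value(wald_chi2):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(wald_chi2 / 2.0))


# (Wald chi-square, reported p) for three cohorts in Table 7-12.
reported = {
    "None, None": (0.224, 0.636),
    "Student, Student": (2.615, 0.106),
    "Minor, None": (5.306, 0.021),
}

for cohort, (wald, p_reported) in reported.items():
    print(f"{cohort}: computed p = {wald_p_value(wald):.3f}, reported p = {p_reported}")
```

A Wald statistic of about 3.84 corresponds to p = 0.05 on one degree of freedom, so only the ‘Minor, None’ cohort crosses the conventional threshold in this table.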

For the scores to the Type category, again the Minor, None experience cohort was significantly different from the Substantial, Substantial experience cohort, as was the Minor, Minor experience cohort, albeit with a weak significance (Table 7-13). The odds of the Minor, None experience cohort answering correctly were 49% of the odds of the Substantial, Substantial experience cohort answering correctly, while the odds of the Minor, Minor experience cohort answering correctly were 63% of the odds of the Substantial, Substantial experience cohort answering correctly.

Variable Wald χ² df p
Experience combined 9.524 9 0.390
None, None 0.025 1 0.874
Student, None 1.667 1 0.197
Student, Student 0.531 1 0.466
Minor, None 5.363 1 0.021
Minor, Student 0.509 1 0.475
Minor, Minor 3.873 1 0.049
Substantial, None 0.134 1 0.715
Substantial, Student 1.667 1 0.197
Substantial, Minor 1.299 1 0.254
Constant 0.164 1 0.686

Table 7-13 Scores for Type category. Logistic regression. Variable: Experience combined. Reference category: Substantial, Substantial

For the scores to the Class category, only the Substantial, Student experience cohort was significantly different from the Substantial, Substantial experience cohort (Table 7-14), and in this case the odds of the Substantial, Student experience cohort answering correctly were 249% of the odds of the Substantial, Substantial experience cohort answering correctly – in other words the former were more likely to identify the Class correctly than the more experienced cohort.

Variable Wald χ² df p
Experience combined 11.169 9 0.264
None, None 0.008 1 0.931
Student, None 0.000 1 0.998
Student, Student 0.235 1 0.628
Minor, None 0.000 1 0.997
Minor, Student 0.000 1 0.998
Minor, Minor 2.834 1 0.092
Substantial, None 0.630 1 0.427
Substantial, Student 4.324 1 0.038
Substantial, Minor 0.900 1 0.343
Constant 95.000 1 0.000

Table 7-14 Scores for Class category. Logistic regression. Variable: Experience combined. Reference category: Substantial, Substantial

These three analyses highlight that, statistically, there is little difference in the scores, and that overall the skill level of the participants is not a good predictor of their ability to identify and classify vein quartz artefacts. While the Minor, None experience cohort was more likely to answer incorrectly for Piece and Type than the most experienced cohort, its answers for Class were not significantly different. And while the Minor, Minor experience cohort was more likely to answer the Type category incorrectly than the most experienced cohort, the significance level was weak. The only cohort that outperformed the most experienced cohort was the Substantial, Student experience cohort when answering the Class category, where its odds of answering correctly were about 2.5 times those of the most experienced cohort.
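
The percentages reported for the three analyses are ratios of odds rather than of probabilities, and the two can differ considerably. The sketch below converts an odds ratio into a probability for a given reference probability. The reference probability of 0.80 is a hypothetical value chosen purely for illustration and is not taken from the quiz results; the function name is likewise an assumption.

```python
def probability_from_odds_ratio(reference_probability, odds_ratio):
    """Probability of a correct answer implied by an odds ratio against a reference."""
    reference_odds = reference_probability / (1 - reference_probability)
    cohort_odds = odds_ratio * reference_odds
    return cohort_odds / (1 + cohort_odds)

# HYPOTHETICAL reference probability; the odds ratio of 2.49 is the 249% for Class.
reference_p = 0.80
cohort_p = probability_from_odds_ratio(reference_p, 2.49)
print(f"cohort probability = {cohort_p:.3f}")
print(f"ratio of probabilities = {cohort_p / reference_p:.2f}")
```

Under this assumed reference probability, odds 2.49 times greater correspond to a much smaller ratio between the probabilities themselves, which is why the results are best read in terms of odds.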

7.4 Overview and discussion

The willingness of a cross-section of attendees at the WAC conference in Dublin to participate in the quartz quiz is commended, and I would like to thank all those who took part; most realised the difficulties in analysing quartz, and a number of people introduced themselves by saying “I hate quartz”, but were nevertheless happy to sit down voluntarily and scrutinise and agonise over a small quartz assemblage. The results of the recognition experiment highlight the significant difficulties in recognising quartz artefacts even for archaeologists with substantial experience in analysing lithic assemblages and specifically quartz assemblages. Those with substantial experience of analysing quartz, however, generally had a greater success rate both at identifying flaked quartz as non-natural and at categorising it into its respective types. Those with substantial quartz experience were also less likely to note occurrences of retouch on the non-retouched artefacts. Nevertheless, the difference in scores was often slight, and the logistic regression analysis highlighted that, apart from a few instances, the skill levels were not a good predictor of the participants’ ability to answer correctly.

A limitation of the experiment was the level of non-responses for categories such as fragmentation and core type. Among the participants with substantial lithic and quartz experience there was a wide range in the level of non-responses to the artefacts; some provided detailed information regarding most artefacts beyond the classification into types – such as flake termination and break type – while others omitted even the basic classifications such as core type. While this may suggest a hesitance to commit on the part of some, the form presented to the participants could have mitigated this by including fill-in-the-blank entries (e.g. see Whittaker et al. 1998) for each requested category; such an entry was only provided for the first category. Providing such entry fields would probably have improved the low response rate for at least some of the participants. Nevertheless, as it stands the results have provided interesting insights into how quartz recognition and categorisation are approached by participants of a variety of analytical skill levels.

One of the more surprising and unexpected results of the quiz was the misidentification of the bipolar component of the experimental assemblage. As noted in Chapter 4, the linkage between bipolar technology and quartz is observed worldwide, so much so that many analysts have needed to stress that the two do not necessarily go together, and indeed, that quartz technology is viable without the use of a bipolar strategy. Given this relationship, it is very surprising that the bipolar artefacts, or at least the bipolar cores, were consistently misidentified and misclassified by the participants, even those with substantial experience of analysing quartz. When formulating the quiz the bipolar cores presented were presumed to be easily identified and all of them were given a difficulty rating of 1. By contrast, I had surmised that there would be general difficulty in classifying the bipolar flakes as being bipolar. Significantly, when those with substantial quartz experience did judge a piece to be a bipolar core, they were only correct 55% of the time. Their incorrect judgements arose from mistaking both platform core fragments and a platform flake for bipolar cores.

The platform cores presented to the participants were of varying surmised difficulty. The complete platform core was surmised to be easily identified. This proved to be so, with a high success rate in identifying it. On the other hand, the three platform core fragments were chosen as they were representative of how a core can fragment into pieces that could be taken for debitage due to their morphology, as discussed in Section 6.4.2. This, again, proved to be the case. Looking at those with substantial quartz experience we can see that for one of the core fragments, Exp# 2023, there was general consensus that it was debitage – 82% of the responses regarded it as debitage, with the remaining 18% describing it as indeterminate. For the other two core fragments there was far less consensus; Exp# 2022 was described as a core in 55% of the responses, as debitage in 27%, and indeterminate in 18%, while the last core fragment was described as indeterminate in 45% of the observations, as debitage in 27%, as natural in 18%, and a core in 9%.
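
The percentages for the core fragments can be read back as counts out of the 11 participants with substantial quartz experience (Table 7-15). The short sketch below assumes that every one of the 11 gave a response for each of these pieces, which the rounded percentages are consistent with; the variable and function names are hypothetical.

```python
PARTICIPANTS = 11  # participants with substantial quartz experience

def percent(count, total=PARTICIPANTS):
    """Whole-number percentage of responses, as reported in the text."""
    return round(100 * count / total)

# Counts inferred from the reported percentages (an assumption, not source data).
exp_2022 = {"core": 6, "debitage": 3, "indeterminate": 2}
for label, count in exp_2022.items():
    print(f"{label}: {count} of {PARTICIPANTS} = {percent(count)}%")
```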

Table 7-15 gives the responses for the 11 participants with substantial quartz experience for the categories of piece, type, and sub-type; as the response rate for fragment was so low, this has been excluded. The substantial quartz experience cohort was much less likely to note instances of retouch or modification on the pieces than the overall average. Nevertheless, they still noted numerous observations of retouch or possible retouch – altogether, those with substantial quartz experience noted six of the artefacts as being retouched, with an overall rate of 5%.

As mentioned previously, Lindgren (1998, 99) presented 10 experimentally made artefacts of quartz to 13 of her colleagues, who presumably had experience in analysing quartz artefacts; the participants were asked to classify the ten pieces in any classification system they wished, and were not specifically asked to note retouch or the lack of it. Of the ten artefacts, six were retouched and the success rate of identifying the retouch was 59%. While Lindgren does not mention the identification of retouch on the non-retouched artefacts, her diagram (Lindgren 1998, 100) shows that of the four non-retouched artefacts, one had three observations of retouch while another had one, giving an overall overestimate of retouch of 8% for the four artefacts. Lindgren does not describe whether some participants were more likely to note retouch or the lack of it. While it is difficult to compare this current test with Lindgren’s, her participants’ overestimate rate of 8% is comparable to this experiment’s overestimate rate of 9%.
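
The 8% figure for Lindgren's experiment can be reproduced from the numbers given above, on the assumption that all 13 participants classified each of the four non-retouched artefacts. The following is a minimal sketch of that arithmetic; the function name is hypothetical.

```python
def overestimate_rate(false_observations, artefacts, participants):
    """Share of observations that noted retouch on non-retouched artefacts."""
    return false_observations / (artefacts * participants)

# Three observations of retouch on one artefact and one on another,
# across four non-retouched artefacts and 13 participants.
rate = overestimate_rate(3 + 1, artefacts=4, participants=13)
print(f"overestimate rate = {rate:.1%}")
```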

Table 7-15 Responses of 11 participants with substantial quartz experience. P = Piece; T = Type; S = Sub-type; ? = not answered; � = partially correct

One approach to analysing the performance of the participants in examining the experimental assemblage is to take the collective observations by skill level and, using the majority rules principle, allow an assemblage to form. Unfortunately the data on the fragmentation of the pieces are not usable in this context, as too few responses were given. Therefore, what follows does not consider the fragmentation of the assemblage; this of course is a distinct disadvantage, as the fragmentation of quartz is a key issue. For the consensus table I have included only those who had either minor or substantial quartz experience, as these two groups generally had a greater degree of consensus, allowing a clearer assemblage to form. For this, I grouped together responses of single, multiplatform, opposed, and dual opposed cores as ‘Core, M/S/O/DO’, and I also grouped all flakes and blades, irregular and regular, as ‘Flake/Blade, Ir/R’. To create the consensus table I tallied the responses for each individual piece and assigned each piece its most common response; where two responses tied, for example if a piece was chosen 33% for generic ‘core’ and 33% for single or multiplatform core, the resultant choice would be a combination called Core/Core, M/S/O/DO.
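
The consensus procedure described above can be sketched as follows. This is only an illustration of the majority rules principle with tied responses combined; the response labels in the grouping table, the function name, and the example responses are hypothetical and are not the participants' answers.

```python
from collections import Counter

# Detailed responses collapsed into the grouped categories used in the text.
# The left-hand labels are assumed wordings for illustration.
GROUPS = {
    "Core, Single platform": "Core, M/S/O/DO",
    "Core, Multiplatform": "Core, M/S/O/DO",
    "Core, Opposed": "Core, M/S/O/DO",
    "Core, Dual opposed": "Core, M/S/O/DO",
    "Flake, Regular": "Flake/Blade, Ir/R",
    "Flake, Irregular": "Flake/Blade, Ir/R",
    "Blade, Regular": "Flake/Blade, Ir/R",
    "Blade, Irregular": "Flake/Blade, Ir/R",
}

def consensus_label(responses):
    """Most common grouped response for one piece; ties are joined with '/'."""
    grouped = [GROUPS.get(response, response) for response in responses]
    counts = Counter(grouped)
    highest = max(counts.values())
    # Counter keeps first-seen order, so tied labels appear in that order.
    return "/".join(label for label, n in counts.items() if n == highest)

# HYPOTHETICAL responses from six participants for a single piece.
responses = [
    "Core", "Core, Multiplatform", "Indeterminate",
    "Core", "Core, Single platform", "Debris",
]
print(consensus_label(responses))
```

In this example generic ‘Core’ and the grouped platform cores each receive a third of the responses, so the piece is assigned the combined label, as in the tie described above.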

Table 7-16 gives the results of the consensus assemblages for those with minor and substantial quartz experience as well as the experimental assemblage itself. Looking at the assemblage formed by the participants with minor quartz experience one can see immediately that there is no evidence for bipolar technology whatsoever, even though bipolar pieces comprised 30% of the experimental assemblage. Similarly, for those with substantial quartz experience, the bipolar component is underrepresented, with only one bipolar core listed. Ironically, this is actually a misidentification, as the ‘bipolar core’ that they agreed upon is one of the platform core fragments. This is also the case for the flakes and cores – while they have agreed that a given artefact is a flake, it may not actually be a flake but rather a core, and vice versa for the cores. What is interesting about these two assemblages is that in using the majority rules principle, a lot of the ‘indeterminate’, ‘natural’, and ‘debris’ categories have been suppressed; for those with substantial experience 10% of the pieces were called indeterminate, none were called natural, and only 5% were called debris. For minor experience, the natural category was used more, with 10% being natural/indeterminate and 5% being natural/all core.

Experimental | Count | Minor quartz experience | Count | Substantial quartz experience | Count
Flake, Regular | 10 | Flake/Blade, Ir/R. | 11 | Flake/Blade, Ir/R. | 12
Flake, Bipolar | 2 | - | - | - | -
- | - | - | - | Debris | 1
Core, Multiplatform | 4 | Core/Core, M/S/O/DO | 5 | Core/Core, M/S/O/DO | 4
Core, Bipolar | 4 | - | - | Core, Bipolar | 1
- | - | Core, M/S/O/DO/Core/Natural | 1 | - | -
- | - | Natural/Indeterminate | 2 | Indeterminate | 2

Table 7-16 Experimental assemblage and two consensus assemblages

This quartz recognition experiment has therefore shown that the identification and classification of vein quartz artefacts is difficult for all levels of experience, with the most experienced participants not outperforming their colleagues with lesser experience to a great extent. While some of the individual participants fared better than their peers with similar skill levels, taken as skill level cohorts, all the cohorts fared poorly. Some of the artefacts presented to the participants, especially the platform core fragments, were surmised to be difficult to correctly identify and classify, and this proved correct. The most surprising misidentification, however, was that of the bipolar cores, which were included in the experiment in order to present the participants with familiar artefacts. This assumption of familiarity proved incorrect, and the consensus assemblages compiled for both minor and substantial quartz experience were almost bereft of a bipolar component, whereas the experimental assemblage actually consisted of 30% bipolar artefacts. While none of the artefacts presented to the participants were modified, 14 of the 20 artefacts were deemed to be modified in at least one observation, with five of these making up 66% of the observations of modification. Overall, the participants overestimated the modification of the artefacts at a rate of 9%, which is similar to the overestimation rate noted by Lindgren in her experiment, highlighting the difficulty in ‘reading’ vein quartz artefacts.

[15] This sheet was slightly amended for the Quiz B participants; Appendix A- 43 is the amended version.
[16] 'Regular' means a flake or blade with an acute, straight edge of at least 10mm in length, and a blade means a flake with a length:breadth ratio of 2:1.
