Pendant-Pendant Sums by Color

1. COLOR - Sum Pendant Colors:

Since the pendant-pendant-sum relationship is the most prevalent sum relationship in the Khipu Field Guide, it serves as a useful proxy for examining sums and their colors broadly.

2. Pendant Sum Color EDA:

2.1 By Color Frequency

How are pendant sum cords distributed by color?

Code

#=======================================================
# INITIALIZE
# Read in the Fieldmark and its associated dataframes
#=======================================================
import pandas as pd
import qollqa_chuspa as qc
import plotly

from fieldmark_ascher_pendant_pendant_sum import FieldmarkPendantPendantSum
aFieldmark = FieldmarkPendantPendantSum()
khipu_df = aFieldmark.dataframes[0].dataframe
sum_cord_df = aFieldmark.dataframes[1].dataframe

(khipu_dict, all_khipus) = qc.fetch_khipus()

# Initialize plotly
plotly.offline.init_notebook_mode(connected = False);

Code

from statistics import mean
import math

sum_cord_df = aFieldmark.dataframes[1].dataframe
def pendant_mean_by_color(aColor):
    cords_with_color_df = sum_cord_df[sum_cord_df.cord_color==aColor]
    pendant_color_values = list(cords_with_color_df.cord_value.values)
    the_mean_value = mean(pendant_color_values) if len(pendant_color_values) > 0 else 0
    return math.log(the_mean_value,10) if the_mean_value > 0 else 0

def pendant_color_occurrences(aColor):
    cords_with_color_df = sum_cord_df[sum_cord_df.cord_color==aColor]
    num_occurrences = len(cords_with_color_df)
    return num_occurrences

all_colors = set(sum_cord_df.cord_color.values)
color_frequency = [(aColor, pendant_color_occurrences(aColor), pendant_mean_by_color(aColor)) for aColor in all_colors]

Code

color_frequency_df = pd.DataFrame(color_frequency, columns = ['ascher_color', 'num_occurrences', 'log_mean_value'])
color_frequency_df = color_frequency_df.sort_values(by=['num_occurrences'], ascending=False);
color_frequency_df.head(20)

	ascher_color	num_occurrences	log_mean_value
14	W	1763	2.178977
92	AB	924	2.161368
95	MB	623	1.929419
186	YB	375	2.041393
97	B	304	2.107210
115	W:MB	164	2.103804
143	W:AB	160	2.000000
212	W:KB	108	2.315970
208	AB:MB	105	1.892095
194	KB	88	1.662758
173	DB	88	2.004321
227	RB	87	1.806180
148	LK	82	1.544068
20	NB	73	2.472756
122	GG	61	2.285557
140	YG	52	2.064458
66	LB	47	2.271842
153	AB:GG	47	2.139879
162	BG	46	2.181844
147	HB	41	1.740363
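Because log_mean_value is a base-10 logarithm, each entry can be turned back into an approximate mean cord value by raising 10 to that power. The short sketch below does this for three rows copied from the table above; the dictionary names are purely illustrative and not part of the notebook.

```python
# log_mean_value entries copied from the table above (rows W, W:AB and NB).
log_mean_values = {"W": 2.178977, "W:AB": 2.000000, "NB": 2.472756}

# Undo the base-10 logarithm to recover the approximate mean cord value.
mean_values = {color: 10 ** log_value for color, log_value in log_mean_values.items()}

for color, value in mean_values.items():
    print(f"{color}: mean cord value is roughly {value:.0f}")
```

The NB result agrees with the mean_value reported for NB in the mean cord value table later on this page.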

Code

import plotly.express as px

legend_text = "<i style=\"font-size:10pt;\">&nbsp;&nbsp;Color = Log_10 of Mean(Cord Values)" + \
              " - Hover to View Khipu/Cord Info</i>"
# Color each point by the log_mean_value column so that its label below is applied to the colorbar.
fig = px.scatter(color_frequency_df.head(75), x='ascher_color', y='num_occurrences', log_y=True,
                 color='log_mean_value',
                 labels={'ascher_color': "Ascher Color", 'num_occurrences': "Pendant Sum Color Frequency (Log_10)", 'log_mean_value': "Log_10 of Mean Cord Value"},
                 title=f"Pendant Sum Cords by Color: (Top 75){legend_text}", width=944, height=944)
fig.update_layout(showlegend=True)
fig.show()  # show() returns None, so keep the figure in fig and call show() separately

Code

each_color_occurences = list(color_frequency_df.num_occurrences.values)
all_occurrences = float(sum(each_color_occurences))
# Percentages are rounded to whole numbers before plotting.
color_frequency_df['proportions'] = [round(100.0*float(x)/all_occurrences) for x in each_color_occurences]
# The labels dict is keyed by column name; keys "x" and "y" would be ignored here.
fig = px.bar(color_frequency_df[:75], x="ascher_color", y="proportions",
                 labels={"ascher_color": "Ascher Color", "proportions": "% of Overall Pendant Sum Colors"},
                 title="Percent of Pendant Sum Color to Overall Pendant Sums",
                 width=944, height=400)
fig.update_layout(showlegend=False)
fig.show()

These two charts tell us that while White is the predominant pendant sum color, it comprises only about 27% of the pendant sum colors. That is about the same as its overall occurrence as a cord in the Khipu Field Guide.
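As a quick arithmetic check of the White share, the counts in the color frequency table can be divided by the total number of pendant sum cords. The sketch below assumes the denominator is the 6443 pendant sum cords reported later on this page; the variable names are illustrative only.

```python
# Counts copied from the color frequency table above (five most frequent colors).
top_color_counts = {"W": 1763, "AB": 924, "MB": 623, "YB": 375, "B": 304}

# Assumption: the denominator is the 6443 pendant sum cords reported later on this page.
total_sum_cords = 6443

# Percentage share of each color among all pendant sum cords.
shares = {color: 100.0 * count / total_sum_cords for color, count in top_color_counts.items()}

for color, share in shares.items():
    print(f"{color}: {share:.1f}% of pendant sum cords")
```

Under that assumption White comes out at a little over 27%, and the five most frequent colors together cover well over half of all pendant sum cords.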

2.2 Sum Color Frequency vs All Pendant Color Frequency

How does the distribution of pendant sum cord colors compare to the overall distribution of colors of significant cords (i.e., cords with a value > 1) in all the khipus? To find out, let’s gather the total distribution of colors across all sum pendants, and then measure the angle (the cosine distance expressed in degrees) between that distribution and the distribution of all pendants with a value > 1.
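To make the comparison concrete, here is a minimal sketch of how an angle between two color-frequency dictionaries could be computed from their cosine similarity. The function `degrees_between_counts` and the toy counts are hypothetical; this is not the implementation behind `utils_loom`, whose internals are not shown on this page.

```python
import math
from collections import Counter

def degrees_between_counts(counts_a, counts_b):
    """Angle in degrees between two sparse frequency vectors (hypothetical helper)."""
    # Treat each distinct color as one dimension; a missing color counts as 0.
    keys = set(counts_a) | set(counts_b)
    dot = sum(counts_a.get(k, 0) * counts_b.get(k, 0) for k in keys)
    norm_a = math.sqrt(sum(v * v for v in counts_a.values()))
    norm_b = math.sqrt(sum(v * v for v in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cannot measure an angle against an empty frequency vector")
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cosine = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.degrees(math.acos(cosine))

# Toy counts (made up): same proportions at different scales, and a disjoint set.
all_pendants = Counter({"W": 50, "AB": 30, "MB": 20})
sum_pendants = Counter({"W": 5, "AB": 3, "MB": 2})
unrelated = Counter({"KB": 7})

print(f"{degrees_between_counts(all_pendants, sum_pendants):.2f}")
print(f"{degrees_between_counts(all_pendants, unrelated):.2f}")
```

Because the angle depends only on proportions, two distributions with the same color mix give an angle near 0° regardless of how many cords each contains, while distributions that share no colors are 90° apart.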

Code

from collections import OrderedDict, Counter
import utils_loom as uloom

def build_pendant_color_frequencies():
    ascher_colors = []
    for aKhipu in all_khipus:
        khipu_cords = aKhipu.pendant_cords()
        for cord_index, aCord in enumerate(khipu_cords):
            cord_colors = [aColor.full_color for aColor in aCord.ascher_colors]
            if len(cord_colors)>0: ascher_colors += cord_colors
    
    return OrderedDict(Counter(ascher_colors).most_common()) 

pendant_color_frequency_dict = build_pendant_color_frequencies()
pendant_sum_color_frequency_dict = OrderedDict(zip(color_frequency_df.ascher_color.values, color_frequency_df.num_occurrences.values))
theta = uloom.degrees_between_dicts(pendant_color_frequency_dict, pendant_sum_color_frequency_dict)
print(f"Angle between all_colors vector and pendant color vector is {theta:0.2f}°")

Angle between all_colors vector and pendant color vector is 4.91°

That’s astounding. Pendant sum color frequency matches overall pendant color frequency across all the khipus almost exactly (within 5°). Dr. Jon Clindaniel has posited in his Harvard Ph.D. thesis that White (or, more generally, color) has a grammatical role in sums. Looking at the scatterplot above of the top 75 pendant sum cord colors by frequency, one could conclude that Dr. Clindaniel’s argument makes sense.

However, the closeness of the vector angle argues against that case. If color were a grammatical marker for sums, we would expect the color distribution of sum cords to differ more noticeably from that of pendants in general. Consequently, the apparent conclusion is that color does not play a grammatical role in summation patterns.

The reverse argument can also be made: that all pendants are sum cords. Is that true? Let’s look at just this fieldmark alone, pendant-pendant sums, which has the largest number of sum cords:

Code

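# NOTE: kfg_name.values holds one entry per sum cord, so a khipu with several sum cords
# has its pendants counted once per sum cord; iterating over kfg_name.unique() would count each khipu once.
num_khipu_pendants = sum([khipu_dict[khipu_name].num_pendant_cords() for khipu_name in sum_cord_df.kfg_name.values])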
num_sum_pendants = len(aFieldmark.dataframes[1].dataframe)
print(f"num pendant sum_cords is {num_sum_pendants}, out of a set of {num_khipu_pendants} total sum khipus pendants")
percentage = (100.0*num_sum_pendants/float(num_khipu_pendants))
print(f"That's a ratio of {percentage:.2f}%")

num pendant sum_cords is 6443, out of a set of 1524290 total sum khipus pendants
That's a ratio of 0.42%

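So about 0.4% (less than half of 1%) of the pendants counted in khipus that have a pendant-pendant sum relationship are sum cords. Because the denominator counts a khipu’s pendants once for each of its sum cords, the true share is higher than this figure. Even so, that argument might have trouble bearing weight…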
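The denominator in the cell above is summed over `sum_cord_df.kfg_name.values`, which holds one khipu name per sum cord, so a khipu with several sum cords contributes its pendants several times. The toy sketch below, with made-up khipu names and pendant counts, shows how much the ratio can shift when each khipu is counted once instead.

```python
# Hypothetical toy data: one khipu name per sum cord, as in sum_cord_df.kfg_name.
sum_cord_khipu_names = ["K1", "K1", "K1", "K2"]
# Hypothetical pendant counts per khipu.
pendant_counts = {"K1": 100, "K2": 50}

num_sum_cords = len(sum_cord_khipu_names)

# Summing over every row counts K1's pendants three times.
per_row_total = sum(pendant_counts[name] for name in sum_cord_khipu_names)
# Summing over unique names counts each khipu's pendants once.
per_khipu_total = sum(pendant_counts[name] for name in set(sum_cord_khipu_names))

print(f"per-row denominator:   {per_row_total}, ratio {100.0 * num_sum_cords / per_row_total:.2f}%")
print(f"per-khipu denominator: {per_khipu_total}, ratio {100.0 * num_sum_cords / per_khipu_total:.2f}%")
```

The same de-duplication could be applied to the real data by iterating over the unique khipu names, though the resulting percentage is not computed here.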

As a final confirmation, let’s look at the color distributions of the two sets of colors: pendant sum colors vs. pendant colors.


3. By Mean Cord Value:

Code

sum_cord_df = aFieldmark.dataframes[1].dataframe
def pendant_mean_by_color(aColor):
    cords_with_color_df = sum_cord_df[sum_cord_df.cord_color==aColor]
    pendant_color_values = list(cords_with_color_df.cord_value.values)
    the_mean_value = mean(pendant_color_values) if len(pendant_color_values) > 0 else 0
    return the_mean_value

def pendant_color_occurrences(aColor):
    cords_with_color_df = sum_cord_df[sum_cord_df.cord_color==aColor]
    num_occurrences = len(cords_with_color_df)
    return math.log(num_occurrences,10) if num_occurrences > 0 else 0

sum_cord_colors = list(sum_cord_df.cord_color.values)
sum_color_counter = Counter(sum_cord_colors)
sum_color_frequency = sum_color_counter.most_common()
all_colors = set(sum_cord_colors)
color_means = [(aColor, pendant_mean_by_color(aColor), pendant_color_occurrences(aColor)) for aColor in all_colors]

Code

color_by_value_df = pd.DataFrame(color_means, columns = ['color', 'mean_value', 'log_num_occurrences'])
color_by_value_df = color_by_value_df.sort_values(by=['mean_value'], ascending=False);
color_by_value_df.head(20)

	color	mean_value	log_num_occurrences
32	B:PB	1000	0.000000
25	W:BD	932	1.204120
196	B:GG	891	0.301030
31	B-YG	822	0.000000
203	B:GG:LB	670	0.000000
114	W-DB	546	1.079181
96	W%MB	516	1.041393
201	PK	428	1.204120
163	W%AB	385	0.698970
116	GL-MB	380	0.000000
136	HB-KB	370	0.000000
180	LG	348	0.903090
145	BB-CB	336	0.000000
210	YG:DB	330	0.301030
181	RG	300	0.301030
20	NB	297	1.863323
189	W:YB	291	1.477121
133	YB:BS	285	1.113943
38	W-MB	280	1.000000
47	W-GG	267	0.602060

Code

fig = px.scatter(color_by_value_df.head(75), x='color', y='mean_value', 
             size='log_num_occurrences', 
             color='log_num_occurrences', 
             labels={"color": "Ascher Color", "mean_value": "Mean Cord Value", "log_num_occurrences": "Log_10 of # Color occurrences",},
             title="Top 75 Pendant Sum Cords by Mean Cord Value per Color ", width=944, height=944)
fig.update_layout(showlegend=True)
fig.show()  # show() returns None, so keep the figure in fig and call show() separately


As Dr. Clindaniel also noted in his thesis, the majority of the large-valued Ascher cord colors in sums are barberpole or mottled; few are solid:

Code

def is_solid_ascher_color(aColor): 
    return isinstance(aColor,str) and not (('-' in aColor) or (':' in aColor) or ('%' in aColor))

# Map each color to its mean cord value, then split the colors into multi-color and solid groups.
all_colors = OrderedDict(zip(color_by_value_df.color.values, color_by_value_df.mean_value.values))
complex_colors = OrderedDict([(color, mean_value) for (color, mean_value) in all_colors.items() if not is_solid_ascher_color(color)])
solid_colors = OrderedDict([(color, mean_value) for (color, mean_value) in all_colors.items() if is_solid_ascher_color(color)])
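As a rough check of that observation, the same solid-color rule can be applied to the 20 colors with the highest mean values, copied here from the table earlier in this section. This sketch only tallies solid versus multi-color codes for those 20 rows and says nothing about the remaining colors; the list name is illustrative.

```python
def is_solid_ascher_color(color):
    # Same rule as the cell above: a solid color code contains no '-', ':' or '%'.
    return isinstance(color, str) and not any(marker in color for marker in "-:%")

# The 20 colors with the highest mean cord value, copied from the table in this section.
top_20_by_mean_value = [
    "B:PB", "W:BD", "B:GG", "B-YG", "B:GG:LB", "W-DB", "W%MB", "PK", "W%AB", "GL-MB",
    "HB-KB", "LG", "BB-CB", "YG:DB", "RG", "NB", "W:YB", "YB:BS", "W-MB", "W-GG",
]

solid = [color for color in top_20_by_mean_value if is_solid_ascher_color(color)]
non_solid = [color for color in top_20_by_mean_value if not is_solid_ascher_color(color)]

print(f"solid: {len(solid)} {solid}")
print(f"multi-color: {len(non_solid)} of {len(top_20_by_mean_value)}")
```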


4. By Banded vs Seriated

Code

def banded_color(kfg_name): 
    aKhipu = khipu_dict[kfg_name]
    is_banded = aKhipu.num_banded_groups() > aKhipu.num_seriated_groups()
    return 0.0 if is_banded else 1.0
def banded_ratio(kfg_name):
    aKhipu = khipu_dict[kfg_name]
    total_groups = aKhipu.num_cord_groups()
    return aKhipu.num_banded_groups()/total_groups if total_groups > 0 else 0

khipu_df['banded_color'] = [banded_color(x) for x in khipu_df.kfg_name.values]
khipu_df['banded_ratio'] = [banded_ratio(x) for x in khipu_df.kfg_name.values]
khipu_df['num_banded_groups'] = [khipu_dict[x].num_banded_groups() for x in khipu_df.kfg_name.values]
khipu_df['num_seriated_groups'] = [khipu_dict[x].num_seriated_groups() for x in khipu_df.kfg_name.values]

fig = px.scatter(khipu_df, 
                 x='num_seriated_groups', y='num_banded_groups', 
                 hover_name='kfg_name', hover_data=["num_banded_groups", "num_seriated_groups", "num_sum_cords"],
                 size="num_sum_cords", 
                 color='banded_color', color_continuous_scale=['#3c3fff', '#ff3030',],
                 labels={"num_sum_cords":"Number of Pendant Pendant Sum Cords"},
                 title="<b>Pendant Pendant Sums by Banded/Seriated</b> - Blue=Banded, Red=Seriated, Size=#SumCords",
                 width=944, height=944)
fig.update_coloraxes(showscale=False)
fig.show()  # show() returns None, so keep the figure in fig and call show() separately

5. Conclusions:

A majority of large-valued pendant sum cords have a seriated color scheme. This provides additional confirmation of Dr. Clindaniel’s argument in his Ph.D. thesis that the majority of the large-valued Ascher cord colors in sums are barberpole or mottled, and few are solid.
White is the most common pendant sum cord color (<30%), followed by AB (Light Brown), MB (Moderate Brown), YB (Light Yellowish Brown), and B (Moderate Yellowish Brown).
White does not appear to play a role as a grammatical marker for pendant sum summation. More examination of this can be found on the White Cord EDA page.

See here for color name descriptions.