Pendant-Pendant Sums by Color


1. COLOR - Sum Pendant Colors:

Because the pendant-pendant sum is the most prevalent sum relationship in the Khipu Field Guide, it is a useful proxy for examining sums and their colors broadly.

2. Pendant Sum Color EDA:

2.1 By Color Frequency

How are pendant sum cords distributed by color?

Code
#=======================================================
# INITIALIZE
# Read in the Fieldmark and its associated dataframes
#=======================================================
import pandas as pd
import qollqa_chuspa as qc
import plotly

from fieldmark_ascher_pendant_pendant_sum import FieldmarkPendantPendantSum
aFieldmark = FieldmarkPendantPendantSum()
khipu_df = aFieldmark.dataframes[0].dataframe
sum_cord_df = aFieldmark.dataframes[1].dataframe

(khipu_dict, all_khipus) = qc.fetch_khipus()

# Initialize plotly
plotly.offline.init_notebook_mode(connected = False);
Code
from statistics import mean
import math

sum_cord_df = aFieldmark.dataframes[1].dataframe
def pendant_mean_by_color(aColor):
    cords_with_color_df = sum_cord_df[sum_cord_df.cord_color==aColor]
    pendant_color_values = list(cords_with_color_df.cord_value.values)
    the_mean_value = mean(pendant_color_values) if len(pendant_color_values) > 0 else 0
    return math.log(the_mean_value,10) if the_mean_value > 0 else 0

def pendant_color_occurrences(aColor):
    cords_with_color_df = sum_cord_df[sum_cord_df.cord_color==aColor]
    num_occurrences = len(cords_with_color_df)
    return num_occurrences

all_colors = set(sum_cord_df.cord_color.values)
color_frequency = [(aColor, pendant_color_occurrences(aColor), pendant_mean_by_color(aColor)) for aColor in all_colors]
Code
color_frequency_df = pd.DataFrame(color_frequency, columns = ['ascher_color', 'num_occurrences', 'log_mean_value'])
color_frequency_df = color_frequency_df.sort_values(by=['num_occurrences'], ascending=False);
color_frequency_df.head(20)
     ascher_color  num_occurrences  log_mean_value
14   W             1763             2.178977
92   AB            924              2.161368
95   MB            623              1.929419
186  YB            375              2.041393
97   B             304              2.107210
115  W:MB          164              2.103804
143  W:AB          160              2.000000
212  W:KB          108              2.315970
208  AB:MB         105              1.892095
194  KB            88               1.662758
173  DB            88               2.004321
227  RB            87               1.806180
148  LK            82               1.544068
20   NB            73               2.472756
122  GG            61               2.285557
140  YG            52               2.064458
66   LB            47               2.271842
153  AB:GG         47               2.139879
162  BG            46               2.181844
147  HB            41               1.740363
Code
import plotly.express as px

legend_text = "<i style=\"font-size:10pt;\">&nbsp;&nbsp;Size/Color = Mean(Cord Values)" + \
              " - Hover to View Khipu/Cord Info</i>"
fig = px.scatter(color_frequency_df.head(75), x='ascher_color', y='num_occurrences', log_y=True,
                 color=list(color_frequency_df.log_mean_value.values)[:75],
                labels={"color": "Mean Cord Value",'num_occurrences':"Pendant Sum Color Frequency (Log_10)", "log_mean_value": "Log_10 of Mean Cord Value", },
                title=f"Pendant Sum Cords by Color: (Top 75){legend_text}", width=944, height=944).update_layout(showlegend=True).show()
Code
each_color_occurences = list(color_frequency_df.num_occurrences.values)
all_occurrences = float(sum(each_color_occurences))
color_frequency_df['proportions'] = [round(100.0*float(x)/float(all_occurrences)) for x in each_color_occurences]
fig = px.bar(color_frequency_df[:75], x="ascher_color", y="proportions", 
                 # labels are keyed by dataframe column name, not by axis letter
                 labels={"ascher_color": "Ascher Color", "proportions": "% of Overall Pendant Sum Colors"},
                 title="Percent of Pendant Sum Color to Overall Pendant Sums",     
                 width=944, height=400).update_layout(showlegend=False).show()

These two charts tell us that while White is the predominant pendant sum color, it comprises only about 27% of the pendant sum colors, about the same as its overall occurrence as a cord in the Khipu Field Guide.
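To make the share for White concrete, here is a minimal sketch that recomputes it from the counts in the frequency table above. It assumes the total number of pendant sum cords is the 6443 printed later on this page; the helper name `percent_share` is hypothetical and not part of the original notebook.

```python
# Counts copied from the top rows of the frequency table above.
top_counts = {"W": 1763, "AB": 924, "MB": 623, "YB": 375, "B": 304}

# Assumption: the total is the number of pendant sum cords (6443)
# reported later on this page.
total_sum_cords = 6443


def percent_share(count, total):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100.0 * count / total, 1)


shares = {color: percent_share(n, total_sum_cords) for color, n in top_counts.items()}
print(shares["W"])  # share of White among pendant sum cords, in percent
```

Keeping one decimal place here, rather than rounding to whole percentages as the bar chart does, makes it easier to see how close White sits to the 27% mark.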

2.2 Sum Color Frequency vs All Pendant Color Frequency

How does the distribution of pendant sum cord colors compare to the overall distribution of colors of significant cords (i.e., cords with a value > 1) in all the khipus? To find out, let’s gather the distribution of colors across all sum pendants, and then measure the cosine distance (expressed as an angle) between that distribution and the distribution of colors across all pendants with a value > 1.
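For readers unfamiliar with the measure, the following is a minimal sketch of how an angle between two color-frequency dictionaries could be computed. It is an illustration under stated assumptions, not the implementation of `degrees_between_dicts` in `utils_loom`, which is not shown on this page; the function name `angle_between_counts` and the toy counts are hypothetical.

```python
import math


def angle_between_counts(counts_a, counts_b):
    """Angle in degrees between two frequency dicts treated as sparse vectors.

    Colors missing from one dict are treated as having a count of zero.
    """
    keys = set(counts_a) | set(counts_b)
    dot = sum(counts_a.get(k, 0) * counts_b.get(k, 0) for k in keys)
    norm_a = math.sqrt(sum(v * v for v in counts_a.values()))
    norm_b = math.sqrt(sum(v * v for v in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cannot compute an angle with an all-zero vector")
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cosine = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.degrees(math.acos(cosine))


# Toy data (hypothetical): identical proportions give an angle near 0 degrees,
# while distributions with no colors in common give 90 degrees.
same_shape = angle_between_counts({"W": 10, "AB": 5}, {"W": 20, "AB": 10})
disjoint = angle_between_counts({"W": 10}, {"AB": 7})
print(f"{same_shape:.2f} {disjoint:.2f}")
```

Because the angle depends only on direction, not magnitude, two distributions can be compared even when one set of cords is far larger than the other.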

Code
from collections import OrderedDict, Counter
import utils_loom as uloom

def build_pendant_color_frequencies():
    ascher_colors = []
    for aKhipu in all_khipus:
        khipu_cords = aKhipu.pendant_cords()
        for cord_index, aCord in enumerate(khipu_cords):
            cord_colors = [aColor.full_color for aColor in aCord.ascher_colors]
            if len(cord_colors)>0: ascher_colors += cord_colors
    
    return OrderedDict(Counter(ascher_colors).most_common()) 

pendant_color_frequency_dict = build_pendant_color_frequencies()
pendant_sum_color_frequency_dict = OrderedDict(zip(color_frequency_df.ascher_color.values, color_frequency_df.num_occurrences.values))
theta = uloom.degrees_between_dicts(pendant_color_frequency_dict, pendant_sum_color_frequency_dict)
print(f"Angle between all_colors vector and pendant color vector is {theta:0.2f}° degrees")
Angle between all_colors vector and pendant color vector is 4.91° degrees

That’s astounding. Pendant sum color frequency matches overall pendant color frequency across all the khipus almost exactly (within 5°). Dr. Jon Clindaniel has posited in his Harvard Ph.D. thesis that White (or more generally, color) has a grammatical role in sums. Looking only at the scatterplot above of the top 75 pendant sum cord colors by frequency, one could conclude that Dr. Clindaniel’s argument makes sense.

However, the closeness of the two vectors argues against that reading. If color were a grammatical marker, we would expect the color distribution of sum cords to differ more noticeably from that of pendants in general. Consequently, the apparent conclusion is that color does not play a grammatical role in summation patterns.

The reverse argument can also be made: that all pendants are sum cords. Is that true? Let’s look at just this fieldmark, pendant-pendant sums, which has the largest number of sum cords:

Code
num_khipu_pendants = sum([khipu_dict[khipu_name].num_pendant_cords() for khipu_name in sum_cord_df.kfg_name.values])
num_sum_pendants = len(aFieldmark.dataframes[1].dataframe)
print(f"num pendant sum_cords is {num_sum_pendants}, out of a set of {num_khipu_pendants} total sum khipus pendants")
percentage = (100.0*num_sum_pendants/float(num_khipu_pendants))
print(f"That's a ratio of {percentage:.2f}%")
num pendant sum_cords is 6443, out of a set of 1524290 total sum khipus pendants
That's a ratio of 0.42%

By this count, about 0.4% (less than half of 1%) of the pendants in khipus that have a pendant-pendant sum relationship are themselves sum cords. So that argument might have trouble bearing weight…
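One caveat about the denominator: the code above iterates over `sum_cord_df.kfg_name.values`, which appears to hold one entry per sum cord, so a khipu's pendants would be added once for each of its sum cords. The sketch below uses made-up data (the khipu names and pendant counts are hypothetical) to show how counting per row versus per distinct khipu changes the ratio.

```python
# Hypothetical toy data: one row per sum cord, so khipu names repeat.
sum_cord_khipu_names = ["KH_A", "KH_A", "KH_A", "KH_B"]
pendants_per_khipu = {"KH_A": 100, "KH_B": 50}

# Summing over rows counts KH_A's pendants three times.
per_row_total = sum(pendants_per_khipu[name] for name in sum_cord_khipu_names)

# Summing over distinct names counts each khipu's pendants once.
per_khipu_total = sum(pendants_per_khipu[name] for name in set(sum_cord_khipu_names))

num_sum_cords = len(sum_cord_khipu_names)
print(f"per-row denominator:   {100.0 * num_sum_cords / per_row_total:.2f}%")
print(f"per-khipu denominator: {100.0 * num_sum_cords / per_khipu_total:.2f}%")
```

If the real dataframe behaves like this toy case, the percentage reported above is a lower bound on the true share, so it may be worth re-running the calculation over distinct khipu names before leaning on the exact figure.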

As a final confirmation, let’s look at the color distributions of the two sets of colors: pendant sum colors versus pendant colors.


3. By Mean Cord Value:

Code
sum_cord_df = aFieldmark.dataframes[1].dataframe
def pendant_mean_by_color(aColor):
    cords_with_color_df = sum_cord_df[sum_cord_df.cord_color==aColor]
    pendant_color_values = list(cords_with_color_df.cord_value.values)
    the_mean_value = mean(pendant_color_values) if len(pendant_color_values) > 0 else 0
    return the_mean_value

def pendant_color_occurrences(aColor):
    cords_with_color_df = sum_cord_df[sum_cord_df.cord_color==aColor]
    num_occurrences = len(cords_with_color_df)
    return math.log(num_occurrences,10) if num_occurrences > 0 else 0

sum_cord_colors = list(sum_cord_df.cord_color.values)
sum_color_counter = Counter(sum_cord_colors)
sum_color_frequency = sum_color_counter.most_common()
all_colors = set(sum_cord_colors)
color_means = [(aColor, pendant_mean_by_color(aColor), pendant_color_occurrences(aColor)) for aColor in all_colors]
Code
color_by_value_df = pd.DataFrame(color_means, columns = ['color', 'mean_value', 'log_num_occurrences'])
color_by_value_df = color_by_value_df.sort_values(by=['mean_value'], ascending=False);
color_by_value_df.head(20)
     color    mean_value  log_num_occurrences
32   B:PB     1000        0.000000
25   W:BD     932         1.204120
196  B:GG     891         0.301030
31   B-YG     822         0.000000
203  B:GG:LB  670         0.000000
114  W-DB     546         1.079181
96   W%MB     516         1.041393
201  PK       428         1.204120
163  W%AB     385         0.698970
116  GL-MB    380         0.000000
136  HB-KB    370         0.000000
180  LG       348         0.903090
145  BB-CB    336         0.000000
210  YG:DB    330         0.301030
181  RG       300         0.301030
20   NB       297         1.863323
189  W:YB     291         1.477121
133  YB:BS    285         1.113943
38   W-MB     280         1.000000
47   W-GG     267         0.602060
Code
fig = px.scatter(color_by_value_df.head(75), x='color', y='mean_value', 
             size='log_num_occurrences', 
             color='log_num_occurrences', 
             labels={"color": "Ascher Color", "mean_value": "Mean Cord Value", "log_num_occurrences": "Log_10 of # Color occurrences",},
             title="Top 75 Pendant Sum Cords by Mean Cord Value per Color ", width=944, height=944).update_layout(showlegend=True).show()


As Dr. Clindaniel also noted in his thesis, the majority of the large-valued Ascher cord colors in sums are barberpole or mottled; few are solid:

Code
def is_solid_ascher_color(aColor): 
    return isinstance(aColor,str) and not (('-' in aColor) or (':' in aColor) or ('%' in aColor))

all_colors = OrderedDict(list(zip(list(color_by_value_df.color.values), list(color_by_value_df.mean_value.values))))
complex_colors = OrderedDict([(color, count) for (color, count) in all_colors.items() if not is_solid_ascher_color(color)])
solid_colors = OrderedDict([(color, count) for (color, count) in all_colors.items() if is_solid_ascher_color(color)])
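As a quick check of the solid versus multi-color split, here is a minimal sketch that applies the same separator test to the 20 colors listed in the mean-value table above. The list is copied by hand from that table, and the variable names `top_20_by_mean`, `solid` and `multi` are hypothetical.

```python
# The 20 colors with the largest mean cord value, copied from the table above.
top_20_by_mean = [
    "B:PB", "W:BD", "B:GG", "B-YG", "B:GG:LB", "W-DB", "W%MB", "PK", "W%AB", "GL-MB",
    "HB-KB", "LG", "BB-CB", "YG:DB", "RG", "NB", "W:YB", "YB:BS", "W-MB", "W-GG",
]


def is_solid_ascher_color(a_color):
    """True when the color code has none of the '-', ':' or '%' separators."""
    return isinstance(a_color, str) and not any(sep in a_color for sep in "-:%")


solid = [c for c in top_20_by_mean if is_solid_ascher_color(c)]
multi = [c for c in top_20_by_mean if not is_solid_ascher_color(c)]
print(f"solid: {len(solid)} of {len(top_20_by_mean)} -> {solid}")
print(f"multi-color: {len(multi)} of {len(top_20_by_mean)}")
```

Among these 20 colors only PK, LG, RG and NB are solid. Note that many of the multi-color entries have very few occurrences (a log count of 0 means a single cord), so the split says more about which colors reach high means than about how many cords do.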


4. By Banded vs Seriated

Code
def banded_color(kfg_name): 
    # 0.0 (blue) when banded groups outnumber seriated groups, else 1.0 (red); ties count as seriated
    aKhipu = khipu_dict[kfg_name]
    is_banded = aKhipu.num_banded_groups() > aKhipu.num_seriated_groups()
    return 0.0 if is_banded else 1.0
def banded_ratio(kfg_name):
    aKhipu = khipu_dict[kfg_name]
    total_groups = aKhipu.num_cord_groups()
    return aKhipu.num_banded_groups()/total_groups if total_groups > 0 else 0

khipu_df['banded_color'] = [banded_color(x) for x in khipu_df.kfg_name.values]
khipu_df['banded_ratio'] = [banded_ratio(x) for x in khipu_df.kfg_name.values]
khipu_df['num_banded_groups'] = [khipu_dict[x].num_banded_groups() for x in khipu_df.kfg_name.values]
khipu_df['num_seriated_groups'] = [khipu_dict[x].num_seriated_groups() for x in khipu_df.kfg_name.values]

fig = px.scatter(khipu_df, 
                 x='num_seriated_groups', y='num_banded_groups', 
                 hover_name='kfg_name', hover_data=["num_banded_groups", "num_seriated_groups", "num_sum_cords"],
                 size="num_sum_cords", 
                 color='banded_color', color_continuous_scale=['#3c3fff', '#ff3030',],
                 labels={"num_sum_cords":"Number of Pendant Pendant Sum Cords"},
                 title=f"<b>Pendant Pendant Sums by Banded/Seriated</b> - Blue=Banded, Red=Seriated, Size=#SumCords",
                 width=944, height=944).update_coloraxes(showscale=False).show()

5. Conclusions:

  • A majority of large-valued pendant sum cords have a seriated color scheme. This provides additional confirmation of Dr. Clindaniel’s argument in his Ph.D. thesis that the majority of the large-valued Ascher cord colors in sums are barberpole or mottled; few are solid.
  • White is the most common pendant sum cord color (under 30% of pendant sum cords), followed by AB (Light Brown), MB (Moderate Brown), YB (Light Yellowish Brown), and B (Moderate Yellowish Brown).
  • White does not appear to play a role as a grammatical marker for pendant summation. More examination of this can be found on the White Cord EDA page.

See here for color name descriptions.