peter65374/openbuddy-llemma-34b-gguf

275 runs · Oct 2023 · Cog 0.8.6 · GitHub · Paper · License
4-bit-quantization language-model math-tutor multilingual text-generation

About

This is a Cog implementation of the "openbuddy-llemma-34b" model with 4-bit quantization (GGUF).
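To call the model from Python, a request can be sketched with the `replicate` client. This is an assumed usage pattern, not verified invocation code: the actual `replicate.run` call is commented out because it requires a `REPLICATE_API_TOKEN` and network access. The model reference combines the model name with the version ID listed under Version Details.

```python
# Hypothetical usage sketch with the `replicate` Python client.
# The run call is commented out: it needs REPLICATE_API_TOKEN and network access.

model_ref = (
    "peter65374/openbuddy-llemma-34b-gguf:"
    "5ebb3f4327fb859c1f8197e473b9dbcf2c4d41a91cf6ea10225eb40c49161200"
)

inputs = {
    "prompt": "What is 2 + 2?",
    "temperature": 0.7,
    "top_k": 40,
    "top_p": 0.95,
    "max_new_tokens": 1024,
}

# import replicate
# output = replicate.run(model_ref, input=inputs)
# print("".join(output))  # the output schema is an array of strings

print(model_ref.split(":")[0])  # -> peter65374/openbuddy-llemma-34b-gguf
```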

Example Output

Prompt:

"如果西瓜是1,菠菜是a,萝卜是b,香蕉是2,西红柿是c,葡萄是3,那么土豆是什么?"

(English: "If watermelon is 1, spinach is a, radish is b, banana is 2, tomato is c, and grape is 3, then what is potato?")

Output

根据题目中给出的水果与数字对应关系,我们可以列出一个表格来进行分析:

|  水果  | 数字 |
| :----: | :--: |
|  西瓜  |  1   |
|  菠菜  |  a   |
|  萝卜  |  b   |
|  香蕉  |  2   |
| 西红柿 |  c   |
|  葡萄  |  3   |

从这个表格中,我们可以看出数字与水果之间并没有一种确定的对应关系。因此,无法通过给出的信息确定土豆所对应的数字或者字母。如果题目中提供了更多信息,我们可以进行进一步的分析和推理。

(English: "Based on the correspondences between items and numbers given in the problem, we can lay out a table for analysis. From this table, we can see that there is no determinate correspondence between the numbers and the items. Therefore, the number or letter corresponding to potato cannot be determined from the given information. If the problem supplied more information, we could carry out further analysis and reasoning.")

Performance Metrics

9.18s Prediction Time
120.50s Total Time
All Input Parameters
{
  "debug": true,
  "top_k": 40,
  "top_p": 0.95,
  "prompt": "如果西瓜是1,菠菜是a,萝卜是b,香蕉是2,西红柿是c,葡萄是3,那么土豆是什么?",
  "do_sample": true,
  "num_beams": 1,
  "temperature": 0.7,
  "padding_mode": true,
  "max_new_tokens": 1024,
  "prompt_template": "You are a helpful high school Math tutor. If you don't know the answer to a question, please don't share false information. You can speak fluently in many languages.\nUser: Hi\nAssistant: Hello, how can I help you?</s>\nUser: {prompt}\nAssistant:",
  "presence_penalty": 0,
  "frequency_penalty": 0,
  "repetition_penalty": 1.1
}
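The `prompt_template` in the parameters above is expanded by inserting the user prompt at the `{prompt}` placeholder; the expanded result appears in the execution logs further down. A minimal sketch of that substitution, assuming plain `str.format` semantics (the model's actual formatting code may differ):

```python
# Expanding the prompt_template: the user prompt replaces the
# `{prompt}` placeholder (str.format is one plausible implementation).

template = (
    "You are a helpful high school Math tutor. If you don't know the answer "
    "to a question, please don't share false information. You can speak "
    "fluently in many languages.\nUser: Hi\nAssistant: Hello, how can I help "
    "you?</s>\nUser: {prompt}\nAssistant:"
)

formatted = template.format(prompt="Hi there")
print(formatted.splitlines()[-2])  # -> User: Hi there
```

The resulting string ends with `Assistant:`, so the model's generation continues directly from that turn marker.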
Input Parameters
debug (Type: boolean, Default: false)
Provide debugging output in logs.
top_k (Type: integer, Default: 40, Range: 1 - 100)
The number of highest-probability tokens to consider when generating the output. If > 0, only the top k tokens with the highest probability are kept (top-k filtering).
top_p (Type: number, Default: 0.95, Range: 0.01 - 1)
A probability threshold for generating the output. If < 1.0, only the top tokens with cumulative probability >= top_p are kept (nucleus filtering). Nucleus filtering is described in Holtzman et al. (http://arxiv.org/abs/1904.09751).
prompt (required) (Type: string)
do_sample (Type: boolean, Default: true)
Whether to use sampling; otherwise greedy decoding is used.
num_beams (Type: integer, Default: 1, Range: 1 - 10)
Number of beams for beam search. 1 means no beam search.
temperature (Type: number, Default: 0.7)
The value used to modulate the next-token probabilities.
padding_mode (Type: boolean, Default: true)
Whether to pad the left side of the prompt with the eos token.
max_new_tokens (Type: integer, Default: 1024, Range: 1 - 3500)
The maximum number of tokens the model should generate as output.
stop_sequences (Type: string)
A comma-separated list of sequences at which to stop generation. For example, '<end>,<stop>' will stop generation at the first instance of '<end>' or '<stop>'.
prompt_template (Type: string, Default: You are a helpful high school Math tutor. If you don't know the answer to a question, please don't share false information. You can speak fluently in many languages. User: Hi Assistant: Hello, how can I help you?</s> User: {prompt} Assistant:)
The template used to format the prompt. The input prompt is inserted into the template using the `{prompt}` placeholder.
presence_penalty (Type: number, Default: 0)
Repeat alpha presence penalty (0.0 = disabled).
frequency_penalty (Type: number, Default: 0)
Repeat alpha frequency penalty (0.0 = disabled).
repetition_penalty (Type: number, Default: 1.1, Range: 0.01 - 5)
The parameter for repetition penalty. 1.0 means no penalty; values greater than 1 discourage repetition, values less than 1 encourage it. See [this paper](https://arxiv.org/pdf/1909.05858.pdf) for more details.
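The interaction of `top_k` and `top_p` described above can be sketched as follows. This is a pure-Python illustration of the two filters applied in sequence, not the model's actual sampling code:

```python
# Sketch of top-k followed by top-p (nucleus) filtering on a
# next-token probability distribution, per the descriptions above.

def filter_top_k_top_p(probs, top_k=40, top_p=0.95):
    """Return the (token_id, prob) pairs that survive both filters."""
    # top-k: keep only the k most probable tokens
    ranked = sorted(enumerate(probs), key=lambda kv: kv[1], reverse=True)
    ranked = ranked[:top_k]
    # top-p: keep the smallest prefix whose cumulative probability
    # reaches the nucleus threshold
    kept, cumulative = [], 0.0
    for token_id, p in ranked:
        kept.append((token_id, p))
        cumulative += p
        if cumulative >= top_p:
            break
    return kept

probs = [0.5, 0.3, 0.1, 0.05, 0.05]
print(filter_top_k_top_p(probs, top_k=4, top_p=0.9))
# -> [(0, 0.5), (1, 0.3), (2, 0.1)]
```

With the defaults (`top_k=40`, `top_p=0.95`), sampling then draws from the surviving candidates after renormalizing, with `temperature` scaling the logits beforehand.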
Output Schema

Output

Type: array, Items Type: string

Example Execution Logs
Your formatted prompt is:
You are a helpful high school Math tutor. If you don't know the answer to a question, please don't share false information. You can speak fluently in many languages.
User: Hi
Assistant: Hello, how can I help you?</s>
User: 如果西瓜是1,菠菜是a,萝卜是b,香蕉是2,西红柿是c,葡萄是3,那么土豆是什么?
Assistant:
Inference starting...
after initialization, first token took 0.384
llama_print_timings:        load time =   380.20 ms
llama_print_timings:      sample time =   110.95 ms /   216 runs   (    0.51 ms per token,  1946.88 tokens per second)
llama_print_timings: prompt eval time =   380.11 ms /   108 tokens (    3.52 ms per token,   284.13 tokens per second)
llama_print_timings:        eval time =  8179.60 ms /   215 runs   (   38.04 ms per token,    26.28 tokens per second)
llama_print_timings:       total time =  9112.99 ms
Final output: 根据题目中给出的水果与数字对应关系,我们可以列出一个表格来进行分析:
|  水果   | 数字 |
| :-----: | :--: |
|   西瓜   |   1  |
|   菠菜   |   a  |
|   萝卜   |   b  |
|   香蕉   |   2  |
|   西红柿 |   c  |
|   葡萄   |   3  |
从这个表格中,我们可以看出数字与水果之间并没有一种确定的对应关系。因此,无法通过给出的信息确定土豆所对应的数字或者字母。如果题目中提供了更多信息,我们可以进行进一步的分析和推理。
Generated in 9.114473944995552 seconds.
Tokens per second: 23.59
Tokens per second not including time to first token: 24.51
cur memory: 0
max allocated: 0
peak memory: 0
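The throughput figures in the log can be reproduced from the timing numbers it reports: 215 generated tokens over 9.1145 seconds overall, with the first token arriving after 0.384 seconds. A quick check of that arithmetic:

```python
# Reproducing the tokens-per-second figures from the execution log.

total_time = 9.114473944995552   # "Generated in ... seconds."
first_token = 0.384              # "first token took 0.384"
n_tokens = 215                   # "eval time ... 215 runs"

overall = n_tokens / total_time
after_first = (n_tokens - 1) / (total_time - first_token)

print(round(overall, 2))      # -> 23.59
print(round(after_first, 2))  # -> 24.51
```

Note this is lower than the 26.28 tokens/second reported by `llama_print_timings` for eval alone, since the overall figure also includes prompt evaluation and sampling overhead.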
Version Details
Version ID
5ebb3f4327fb859c1f8197e473b9dbcf2c4d41a91cf6ea10225eb40c49161200
Version Created
October 29, 2023